Systems and methods for secure data aggregation and computation

ABSTRACT

Systems and methods for data aggregation and processing are provided in a manner that is decentralized and preserves privacy. A data aggregation and computation system may include an interface, a controller, and one or more clusters of computation nodes. The interface may receive an inquiry from a requesting entity for computing information regarding an individual based on pieces of information held by a plurality of entities. The controller may communicate an identifier for the individual to a processor system associated with each of the entities based on the inquiry. The clusters of computation nodes may each receive encrypted data fragments from each of the processor systems, the data fragments comprising unrecognizable fragments that no individual processor system can re-assemble to recover the information, perform secure, multi-party computations based on the data fragments, and generate a result based on the secure, multi-party computations for the individual.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a continuation of U.S. application Ser. No. 16/738,942, filed Jan. 9, 2020 and titled “SYSTEMS AND METHODS FOR SECURE DATA AGGREGATION AND COMPUTATION,” which claims priority benefit to U.S. Provisional Application No. 62/791,554, filed Jan. 11, 2019 and titled “SYSTEMS AND METHODS FOR SHARE-NOTHING DATA AGGREGATION AND COMPUTATION,” which is incorporated by reference herein in its entirety for all purposes.

BACKGROUND

Field

The present development relates to secured multi-party computing systems and methods, and, specifically, to calculating various attributes and values among various entities without requiring the entities to share confidential data.

Description of Related Art

In the current age of technology and as smart devices are more closely integrated with daily lives of people across the globe, data is quickly becoming more valuable, and with the increased value, more protected by those that obtain and/or accrue the data. Entities that do obtain and/or have the data are often unwilling to share that data with other entities in view of many fears, including the risk and/or liability of a data breach, privacy concerns of those whose data the entities have, or risk of being replaced by those entities with which they share the data. However, many of these entities often are required to work together. For example, banks must often exchange users' information as required for daily transactions, which may not be desirable for the banks.

Accordingly, improved systems, devices, and methods for efficiently and effectively enabling secured multi-party computing to aggregate data without requiring the sharing of confidential data are desirable.

SUMMARY

Various implementations of methods and devices within the scope of the appended claims each have several aspects, no single one of which is solely responsible for the desirable attributes described herein. Without limiting the scope of the appended claims, some prominent features are described herein.

Details of one or more implementations of the subject matter described in this specification are set forth in the accompanying drawings and the description below. Other features, aspects, and advantages will become apparent from the description, the drawings, and the claims. Note that the relative dimensions of the following figures may not be drawn to scale.

One aspect of the present disclosure described herein includes a data aggregation and computation system. The system comprises an interface configured to receive an inquiry from a requesting entity for computing information regarding an individual based on pieces of information held by a plurality of entities. The system further comprises a controller configured to communicate an identifier for the individual to a processor system associated with each of the entities based on the inquiry. The system further comprises one or more clusters of computation nodes. Each cluster is configured to receive encrypted data fragments from one or more of the processor systems. The processor systems are each configured to generate one or more encrypted data fragments based on processing one or more of the pieces of information held by an entity associated with the respective processor system. The encrypted data fragments comprise unrecognizable fragments that no individual processor system can re-assemble to recover the one or more pieces of the information. Each cluster is also configured to perform secure, multi-party computations based on the data fragments received from each of the processor systems. Each cluster is further configured to generate a result based on the secure, multi-party computations for the individual and communicate the result to the controller. The controller is further configured to generate a response and provide the response to the interface for providing to the requesting entity.

In some aspects, the plurality of entities comprise one or more of a financial institution, a healthcare institution, or a consumer data institution.

In some aspects, the system further comprises an identifier database and the controller is further configured to identify respective identifiers of the individual for each of the plurality of entities based on the inquiry and communicate the respective identifiers to the processor system associated with each of the entities.

In some aspects, each of the one or more clusters of computation nodes is further configured to receive the data fragments for further processing in aggregate from the processor system associated with each of the entities, and wherein each processor system is further configured to perform initial computations on individual pieces of information before generating the data fragments.

In some aspects, the controller is further configured to identify the initial computations performed by the processor system associated with each of the entities and the secure, multi-party computations performed by the one or more clusters of computation nodes.

In some aspects, the controller is further configured to identify a quantity of computation nodes in the one or more clusters that perform the secure, multi-party computations, wherein the quantity is based on a desired security level.

In some aspects, the inquiry comprises an information verification request comprising verification information to be verified, and wherein the response is an affirmative or negative response.

In some aspects, the interface is further configured to provide the affirmative response to the requesting entity in response to the inquiry when the result verifies the verification information and provide the negative response in response to the inquiry when the result does not verify the verification information.

In some aspects, each cluster is further configured to compute an income value for the individual based on the data fragments received from each of the processor systems. The result verifies the verification information when a difference between the verification information and the income value is less than or equal to a threshold value. The result does not verify the verification information when the difference is greater than the threshold value.
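The threshold comparison in this aspect can be sketched in Python as follows. This is a minimal illustration only: the function name `verify_income`, the use of an absolute difference, and the sample figures are assumptions introduced here and are not taken from the disclosure.

```python
def verify_income(stated_income: float, computed_income: float, threshold: float) -> bool:
    """Return True (affirmative response) when the stated income is within
    `threshold` of the income value computed by the clusters, else False."""
    # Assumption: "difference" is read as an absolute difference.
    difference = abs(stated_income - computed_income)
    return difference <= threshold


# Hypothetical figures, for illustration only.
within = verify_income(60_000, 58_500, threshold=2_000)   # difference of 1,500
outside = verify_income(60_000, 55_000, threshold=2_000)  # difference of 5,000
print(within, outside)
```

In this sketch only the boolean would be returned to the requesting entity, consistent with an affirmative or negative response.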

In some aspects, the inquiry comprises a request to compute a credit score for the individual, and wherein the response comprises the credit score for the individual.

Another aspect of the present disclosure described herein includes a method of aggregating and processing data. The method comprises receiving an inquiry from a requesting entity for computing information regarding an individual based on pieces of information held by a plurality of entities and communicating an identifier for the individual to a processor system associated with each of the entities based on the inquiry. The method also comprises receiving encrypted data fragments from one or more of the processor systems, wherein the processor systems are each configured to generate one or more encrypted data fragments based on processing one or more of the pieces of information held by an entity associated with the respective processor system, and wherein the encrypted data fragments comprise unrecognizable fragments that no individual processor system can re-assemble to recover the one or more pieces of the information. The method further comprises performing secure, multi-party computations based on the data fragments received from each of the processor systems, generating a result based on the secure, multi-party computations for the individual, communicating the result to a controller, and generating a response and providing the response to the interface for providing to the requesting entity.

In some aspects, the plurality of entities comprise one or more of a financial institution, a healthcare institution, or a consumer data institution.

In some aspects, the method further comprises identifying respective identifiers of the individual for each of the plurality of entities based on the inquiry, and communicating the respective identifiers to the processor system associated with each of the entities.

In some aspects, the method further comprises receiving the data fragments for further processing in aggregate from the processor system associated with each of the entities, wherein each processor system is further configured to perform initial computations on individual pieces of information before generating the data fragments.

In some aspects, the method further comprises identifying the initial computations performed by the processor system associated with each of the entities and the secure, multi-party computations performed by the one or more clusters of computation nodes.

In some aspects, the method further comprises identifying a quantity of computation nodes in the one or more clusters that perform the secure, multi-party computations, wherein the quantity is based on a desired security level.

In some aspects, the inquiry comprises an information verification request comprising verification information to be verified, and wherein the response is an affirmative or negative response.

In some aspects, the method further comprises providing the affirmative response to the requesting entity in response to the inquiry when the result verifies the verification information and providing the negative response in response to the inquiry when the result does not verify the verification information.

In some aspects, the method further comprises computing an income value for the individual based on the data fragments received from each of the processor systems, wherein the result verifies the verification information when a difference between the verification information and the income value is less than or equal to a threshold value and wherein the result does not verify the verification information when the difference is greater than the threshold value.

In some aspects, the inquiry comprises a request to compute a credit score for the individual, and wherein the response comprises the credit score for the individual.

An additional aspect of the present disclosure described herein includes a method of aggregating and processing data relative to an inquiry regarding an entity. The method comprises receiving identifying information for the entity, identifying one or more identifiers related to the entity based on at least the received identifying information, and communicating the one or more identifiers to a client-side processing unit associated with each of a plurality of partner institutions. The method also comprises, for each client-side processing unit, querying one or more records from a record database based on the one or more identifiers, receiving the one or more records from the record database, processing the received one or more records to generate data fragments, and for each computation node of a computation group, computing combined attributes based on the generated data fragments and generating a response to the inquiry based on the computed combined attributes.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a diagram of an example service platform implementing secure multi-party computations across a plurality of entities.

FIG. 2 shows a dataflow diagram of an example verification of income request as processed by the service platform shown in FIG. 1.

FIG. 3 shows an exemplary architecture and dataflow for the example service platform of FIG. 1 as it processes the verification of income request of FIG. 2.

FIG. 4 shows an exemplary deployment and security architecture for the service platform of FIG. 1.

FIG. 5A shows a first exemplary distributed system that implements the service platform of FIG. 1.

FIG. 5B shows a second exemplary distributed system that implements the service platform of FIG. 1.

FIG. 6 is a block diagram showing example components of a transaction data processing system 1000.

DETAILED DESCRIPTION

In the current data and online environment, many parties have increasing concerns regarding maintaining data privacy and data security. As such, traditional data aggregation models where a data aggregator obtains necessary data from multiple data providers and centralizes the data in one place for further processing are facing significant challenges. Furthermore, a party is often hesitant to share data with other parties for fear of the other parties gaining market share based on the party's data or for fear of the other parties exposing the party to privacy or data concerns. The systems and methods described herein provide a new universal computational framework that is decentralized, secure, privacy preserving, and scalable. The systems and methods achieve the data security and privacy preservation by retaining all data within data providers without having the data providers release respective data outside of respective local environments in any externally re-constructible way. The systems and methods achieve the scalability by decomposing computations as appropriate so computations are practical and by deferring the most computation- and communication-intensive portions to later stages of the computations as much as possible. For example, each data provider may perform local analysis and computations relevant to the data provided by the data provider and provide the results of the local analysis and computations. The results of the local analysis and computations may be shared as data fragments that alone are difficult or impossible to convert into useful and/or recognizable data. Further details of the processing are provided below.

An industrial problem may involve the aggregation and processing of a variety of data from a number of data sources to identify and distill deeper insights from the data. In its most general form, the industrial problem can be expressed in the recursive equation, Equation 1.0:

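\[
\mathrm{outcome}(X) = \bigl(\mathrm{C\_or\_M}(\mathrm{key}_1, f_1, *),\ \mathrm{C\_or\_M}(\mathrm{key}_2, f_2, *),\ \mathrm{C\_or\_M}(\mathrm{key}_3, f_3, *),\ \ldots,\ \mathrm{C\_or\_M}(\mathrm{key}_k, f_k, *)\bigr) \qquad \text{Equation 1.0}
\]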

Where:

- outcome(X) represents the outcome of the calculation of the set of data X;
- ‘*’ can be either the set of data X or another C_or_M;
- C_or_M is an operation, which in some embodiments may be either a ‘Combine’ operation (C), which combines information over the data indexed by a given key (for example, key₁), or a ‘Map’ operation (M), where each record in the data is indexed by the given key. The actual operation in Combine or Map is indicated by the function name identified in the second argument, f₁. The identified function in f₁ can be an operation that performs either ‘Combine’ or ‘Map’.
  - In the case of Combine, the function can be one or more operations such as calculating a total, a product, an average, a maximum value, a minimum value, a count, etc. of the data indexed by the given key.
  - In the case of Map, the function can be one or more operations that either map the value in the record to another value through an operation defined in the function, or it can be a filter function that determines whether to keep the value in the record or not, etc.
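The nesting of Combine and Map operations in Equation 1.0 can be sketched in Python as follows. This is a hedged illustration, not the disclosed implementation: the helper names `map_op` and `combine_op`, the extra `field` argument, the record layout, and the sample data are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Any, Callable, Iterable, Optional

Record = dict[str, Any]


def map_op(key: str, f: Callable[[Any], Optional[Any]],
           records: Iterable[Record]) -> list[Record]:
    """'Map': apply f to the value stored under `key` in each record.
    A function that returns None acts as a filter and drops the record."""
    out = []
    for record in records:
        new_value = f(record[key])
        if new_value is not None:
            out.append({**record, key: new_value})
    return out


def combine_op(key: str, field: str, f: Callable[[list], Any],
               records: Iterable[Record]) -> list[Record]:
    """'Combine': group records by `key` and reduce the `field` values of
    each group with f (for example sum, max, min, or len)."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record[field])
    return [{key: k, field: f(values)} for k, values in groups.items()]


# Hypothetical data set X: deposits observed for two individuals.
X = [
    {"person": "A", "type": "payroll", "amount": 3000.0},
    {"person": "A", "type": "gift", "amount": 200.0},
    {"person": "A", "type": "payroll", "amount": 3100.0},
    {"person": "B", "type": "payroll", "amount": 4000.0},
]


def keep_payroll(deposit_type: str) -> Optional[str]:
    # Filter-style Map function: keep only payroll deposits.
    return deposit_type if deposit_type == "payroll" else None


# Recursive form: the third argument of the Combine is itself a Map over X.
outcome = combine_op("person", "amount", sum, map_op("type", keep_payroll, X))
print(outcome)
```

Here the inner Map plays the role of ‘*’ in the outer Combine, mirroring the statement that ‘*’ can be either the data X or another C_or_M.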

Any computation framework can be used to calculate the recursive Equation 1.0. For example, the data management community may use SQL to calculate Equation 1.0 while the “Big Data” community may use MapReduce. However, such calculation of the Equation 1.0 using typical existing computation frameworks may assume or expect that the data, X, utilized to solve the recursive Equation 1.0 is accessible by any single entity performing the calculation or generally using the computation framework. In such instances, the single entity may operate as an aggregator of data from multiple sources and operate on the aggregated data in its raw form.

The systems and methods described herein leverage Secure Multi-Party Computation (“SMPC”) to construct a decentralized computational environment that enables data owners and/or custodians (for example, financial institutions) to utilize their own privately and/or securely held client data to create aggregated or derivative data for those clients in conjunction with privately and/or securely held client data of others (for example, other financial institutions) without any party having to physically or digitally share their own private data with anyone else. The described systems and methods provide benefits of SMPC while relieving the participating parties of the burden of hosting and maintaining the sophisticated equipment and mechanisms that provide the secure computation capabilities by decentralizing the computation environment in a secured cloud environment hosted by a third party. The resulting architecture is flexible in that participating parties can optionally host one or more parts of the computation environment (for example, the SMPC computation environment) should a use case benefit from such a configuration.

These systems and methods may implement an algorithm that establishes and utilizes a decentralized computational framework that overcomes issues in the above referenced centralized computational framework. The algorithm may begin with the recursive Equation 1.0. However, instead of processing the Equation 1.0 with the computational framework, as described above, the algorithm may decompose the Equation 1.0 into Equation 2.0 by expanding the data X into data held by individual data providers, such as may be represented in the below equation according to some embodiments:

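\[
\begin{aligned}
\mathrm{outcome}(X) ={}& \bigl(\mathrm{C\_or\_M}(\mathrm{key}_1, f_1, *),\ \mathrm{C\_or\_M}(\mathrm{key}_2, f_2, *),\ \mathrm{C\_or\_M}(\mathrm{key}_3, f_3, *),\ \ldots,\ \mathrm{C\_or\_M}(\mathrm{key}_k, f_k, *)\bigr) \\
={}& \mathrm{C\_or\_M}\bigl(\mathrm{key}_1, f_1,\ \mathrm{C\_or\_M}(\mathrm{key}_2, f_2, X_{B_1}),\ \mathrm{C\_or\_M}(\mathrm{key}_3, f_3, X_{B_1}),\ \ldots,\ \mathrm{C\_or\_M}(\mathrm{key}_k, f_k, X_{B_1}), \\
& \quad \mathrm{C\_or\_M}(\mathrm{key}_2, f_2, X_{B_2}),\ \mathrm{C\_or\_M}(\mathrm{key}_3, f_3, X_{B_2}),\ \ldots,\ \mathrm{C\_or\_M}(\mathrm{key}_k, f_k, X_{B_2}),\ \ldots, \\
& \quad \mathrm{C\_or\_M}(\mathrm{key}_2, f_2, X_{B_n}),\ \mathrm{C\_or\_M}(\mathrm{key}_3, f_3, X_{B_n}),\ \ldots,\ \mathrm{C\_or\_M}(\mathrm{key}_k, f_k, X_{B_n})\bigr)
\end{aligned}
\qquad \text{Equation 2.0}
\]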

Where $X = (X_{B_1}, \ldots, X_{B_n})$ and $X_{B_i}$ is the data from party $B_i$.

The algorithm then determines which of the operations in the Equation 2.0 are performed locally and which are performed by the computational framework. For example, portions of the C_or_M function will be performed locally (for example, at the individual data provider providing the data, such a portion identified as C_or_M_local) while other portions of the C_or_M function are performed by a decentralized system, for example an SMPC system (such a portion identified as C_or_M_secure). Starting from the last C_or_M function in the recursion and working from the bottom up, the algorithm may determine whether the key in the C_or_M function is contained in each data provider and, if so, the C_or_M_local function will be executed locally. If the C_or_M function is a Map function, the C_or_M_local function will be executed locally; if the C_or_M function is a Combine function, then the algorithm determines whether the C_or_M function is decomposable or not. If the C_or_M function is decomposable (for example, $f(X_{B_1}, \ldots, X_{B_n}) = g(h(X_{B_1}), \ldots, h(X_{B_n}))$, where the function is, for example, a sum, product, minimum, maximum, and so forth), then the algorithm limits a key range in C_or_M(key, . . . ) with a provider ID as C_or_M_local(provider_id, key, . . . ) so that h(*) is executed locally with the provider and g(*), which corresponds to a C_or_M_secure operation, is executed securely to combine the results. However, if the C_or_M function is not decomposable, but the computation can be approximated (for example, a median of medians can be an approximation of the true median), then the C_or_M function is replaced with an approximation function that can execute C_or_M_local locally if desired.
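The decomposition f(X_B1, ..., X_Bn) = g(h(X_B1), ..., h(X_Bn)) described above can be illustrated with a short Python sketch. The provider names, values, and the helper `decomposed` are hypothetical, and the combining step g runs here in plaintext purely for readability; in the described system that step would correspond to a C_or_M_secure operation carried out by the SMPC computation nodes.

```python
from statistics import median

# Hypothetical per-provider data; each list stays with its own provider.
provider_data = {
    "B1": [1200.0, 1300.0, 1250.0],
    "B2": [900.0, 950.0],
    "B3": [2000.0],
}


def decomposed(h, g, data_by_provider):
    """Run h locally at each provider (C_or_M_local), then combine the
    local results with g (standing in for C_or_M_secure)."""
    local_results = [h(values) for values in data_by_provider.values()]
    return g(local_results)


all_values = [v for values in provider_data.values() for v in values]

# Decomposable Combine functions: the decomposed result equals the direct one.
total = decomposed(sum, sum, provider_data)
largest = decomposed(max, max, provider_data)
assert total == sum(all_values)
assert largest == max(all_values)

# A median is not decomposable this way; a median of local medians is only
# an approximation of the true median.
approx_median = decomposed(median, median, provider_data)
true_median = median(all_values)
print(total, largest, approx_median, true_median)
```

With these sample values the median of medians is 1250.0 while the true median is 1225.0, which shows why the text treats that case as an approximation rather than an exact decomposition.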

For the remainder of the C_or_M function (for example, the C_or_M_secure), the corresponding operations and/or computations may be performed in a decentralized manner, for example via the SMPC system. Thus, the C_or_M_secure computations are performed by one or more computation nodes of the SMPC system. Such computations of Equation 2.0 may be scaled. For example, scaling these computations may comprise the algorithm determining the desired security level (for example, no more than k computation nodes can be compromised at the same time). Once the security level is determined, the algorithm may determine an appropriate number of SMPC computation nodes per algorithm, such as n=(k*2+1). For example, the number of computation nodes n used may be a function of the number of nodes k of concern that a bad party may compromise or to which the bad party gains access. Thus, for the SMPC system to maintain security when k=5 nodes are compromised, the SMPC system may use n=11 nodes so that the bad party will not have access to a majority of the SMPC computation nodes.
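The relation n=(k*2+1) given above can be checked with a few lines of Python. The function name `smpc_node_count` is an assumption used only for this sketch.

```python
def smpc_node_count(k: int) -> int:
    """Number of SMPC computation nodes n needed so that k compromised
    nodes never form a majority, using n = 2k + 1 from the text."""
    if k < 0:
        raise ValueError("k must be non-negative")
    return 2 * k + 1


for k in (1, 2, 5):
    n = smpc_node_count(k)
    # The compromised nodes stay a strict minority of the n nodes.
    assert k < n / 2
    print(f"tolerate k={k} compromised nodes -> n={n} computation nodes")
```

For k=5 this gives n=11, so even with five compromised nodes the remaining six still form a majority.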

For each of the data providers, a local agent associated with the data provider (for example, part of a processing system of the data provider) fragments the data elements to be shared with the SMPC (for example, computed using the C_or_M_secure). This process may utilize a “secure data adaptor layer”. In some embodiments, the local agent fragments the data elements to be shared into n data fragments that, if assembled appropriately, form the data elements to be shared. Each of the n data fragments is distributed to a respective one of the n SMPC computation nodes that perform the remaining C_or_M_secure operations or functions in a secure manner (for example, as described with respect to the SMPC system described herein). Illustrations of this algorithm being applied are provided below.
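One simple way to picture the fragmentation step is additive secret sharing, sketched below in Python. The text does not name a particular sharing scheme, so this technique is swapped in purely for illustration; the modulus, the function names `fragment` and `reassemble`, and the provider values are assumptions. Note that plain additive sharing needs all n fragments to re-assemble a value, which may differ from the scheme an actual deployment would use.

```python
import secrets

MODULUS = 2**61 - 1  # assumed modulus for the illustration


def fragment(value: int, n: int) -> list[int]:
    """Split a non-negative integer into n random-looking fragments whose
    sum modulo MODULUS equals the value."""
    if n < 2:
        raise ValueError("need at least two fragments")
    shares = [secrets.randbelow(MODULUS) for _ in range(n - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reassemble(shares: list[int]) -> int:
    """Recover the shared value from all of its fragments."""
    return sum(shares) % MODULUS


n = 3  # number of SMPC computation nodes in this toy example
# Hypothetical local results, one per data provider.
provider_values = {"B1": 3750, "B2": 1850, "B3": 2000}

# Each provider's local agent fragments its own value.
fragments = {p: fragment(v, n) for p, v in provider_values.items()}

# Node i receives the i-th fragment from every provider and adds them up;
# no single node sees any provider's value.
node_sums = [
    sum(fragments[p][i] for p in fragments) % MODULUS for i in range(n)
]

# Combining the per-node results yields the aggregate only.
total = reassemble(node_sums)
print(total)
```

Because addition commutes with this kind of sharing, the nodes can compute a secure sum without any node holding a re-assemblable copy of an individual provider's data.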

1. Exemplary Term Descriptions

To facilitate an understanding of the systems and methods discussed herein, a number of terms are described below. The terms described below, as well as other terms used herein, should be construed to include the provided descriptions, the ordinary and customary meaning of the terms, and/or any other implied meaning for the respective terms. Thus, the descriptions below do not limit the meaning of these terms, but only provide exemplary definitions.

Data Store: Includes any computer readable storage medium and/or device (or collection of data storage mediums and/or devices). Examples of data stores include, but are not limited to, optical disks (for example, CD-ROM, DVD-ROM, and so forth), magnetic disks (for example, hard disks, floppy disks, and so forth), memory circuits (for example, solid state drives, random-access memory (“RAM”), and so forth), and/or the like. Another example of a data store is a hosted storage environment that includes a collection of physical data storage devices that may be remotely accessible and may be rapidly provisioned as needed (commonly referred to as “cloud” storage).

Database: Includes any data structure (and/or combinations of multiple data structures) for storing and/or organizing data, including, but not limited to, relational databases (for example, Oracle databases, MySQL databases, and so forth), non-relational databases (for example, NoSQL databases, and so forth), in-memory databases, spreadsheets, comma separated values (“CSV”) files, eXtensible Markup Language (“XML”) files, text (“TXT”) files, flat files, spreadsheet files, and/or any other widely used or proprietary format for data storage. Databases are typically stored in one or more data stores. Accordingly, each database referred to herein (for example, in the description herein and/or the figures of the present application) is to be understood as being stored in one or more data stores.

Database Record and/or Record: Includes one or more related data items stored in a database. The one or more related data items making up a record may be related in the database by a common key value and/or common index value, for example.

Event Notification, Notification, and/or Alert: Includes any electronic notification sent from one computer system to one or more other computing systems. For example, a notification may indicate a new record set or changes to one or more records of interest. Notifications may include information regarding the record change of interest, and may indicate, for example, to a user, an updated view of the data records. Notifications may be transmitted electronically, and may cause activation of one or more processes, as described herein.

Transaction data (also referred to as event data) may generally refer, in some embodiments, to data associated with any event, such as an interaction by a user device with a server, website, database, and/or other online data owned by or under control of a requesting entity, such as a server controlled by a third party, such as a merchant. Transaction data may include merchant name, merchant location, merchant category, transaction dollar amount, transaction date, transaction channel (e.g., physical point of sale, Internet, etc.) and/or an indicator as to whether or not the physical payment card (e.g., credit card or debit card) was present for a transaction. Transaction data structures may include, for example, specific transactions on one or more credit cards of a user, such as the detailed transaction data that is available on credit card statements. Transaction data may also include transaction-level debit information, such as regarding debit card or checking account transactions. The transaction data may be obtained from various sources, such as from credit issuers (e.g., financial institutions that issue credit cards), transaction processors (e.g., entities that process credit card swipes at points-of-sale), transaction aggregators, merchant retailers, and/or any other source. Transaction data may also include non-financial exchanges, such as login activity, Internet search history, Internet browsing history, posts to a social media platform, or other interactions between communication devices. In some implementations, the users may be machines interacting with each other (e.g., machine-to-machine communications). Transaction data may be presented in raw form. Raw transaction data generally refers to transaction data as received by the transaction processing system from a third party transaction data provider. Transaction data may be compressed. Compressed transaction data may refer to transaction data that may be stored and/or transmitted using fewer resources than when in raw form. Compressed transaction data need not be “uncompressible.” Compressed transaction data preferably retains certain identifying characteristics of the user associated with the transaction data such as behavior patterns (e.g., spend patterns), data cluster affinity, or the like.

User: depending on the context, may refer to a person, such as an individual, consumer, or customer, and/or may refer to an entity that provides input to the system and/or an entity that utilizes a device to receive the event notification, notification or alert (for example, a user who is interested in receiving notifications upon the occurrence of the newly generated record set or changes to records of interest). Thus, in the first context, the terms “user,” “individual,” “consumer,” and “customer” should be interpreted to include single persons, as well as groups of users, such as, for example, married couples or domestic partners, organizations, groups, and business entities. Additionally, the terms may be used interchangeably. In some embodiments, the terms refer to a computing device of a user rather than, or in addition to, an actual human operator of the computing device.

An entity may generally refer to one party involved in a transaction. In some implementations, an entity may be a merchant or other provider of goods or services to one or more users, a financial institution, a bank, a credit card company, an individual, a lender, or a company or organization of some other type.

A model may generally refer to a machine learning construct which may be used by the transaction processing system to automatically generate a result or outcome. A model may be trained. Training a model generally refers to an automated machine learning process to generate the model that accepts an input and provides a result or outcome as an output. A model may be represented as a data structure that identifies, for a given value, one or more correlated values. For example, a data structure may include data indicating one or more categories. In such implementations, the model may be indexed to provide efficient look up and retrieval of category values. In other embodiments, a model may be developed based on statistical or mathematical properties and/or definitions implemented in executable code without necessarily employing machine learning.

A vector encompasses a data structure that can be expressed as an array of values where each value has an assigned position that is associated with another predetermined value. For example, an entity vector will be discussed below. A single entity vector may be used to represent the number of transactions for a number of users at a given merchant. Each entry in the entity vector represents the count while the position within the entity vector may be used to identify the user with whom the count is associated. In some implementations, a vector may be a useful way to hide the identity of a user but still provide meaningful analysis of their transaction data. In the case of entity vectors, as long as the system maintains a consistent position for information related to a user within the vectors including user data, analysis without identifying a user can be performed using positional information within the vectors. Other vectors may be implemented wherein the entries are associated with transaction categories or other classes of transaction data.
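A minimal Python sketch of an entity vector follows. The user labels, merchant names, and the helper `entity_vector` are hypothetical and serve only to show how a fixed position can stand in for a user's identity.

```python
# Private, consistent mapping from user to vector position. Only the party
# that holds this mapping can link a position back to a user.
users = ["user_a", "user_b", "user_c"]
position = {user: index for index, user in enumerate(users)}


def entity_vector(transactions, merchant):
    """Count transactions per user at one merchant, stored by position."""
    vector = [0] * len(position)
    for user, tx_merchant in transactions:
        if tx_merchant == merchant:
            vector[position[user]] += 1
    return vector


# Hypothetical (user, merchant) transaction pairs.
transactions = [
    ("user_a", "grocer"),
    ("user_b", "grocer"),
    ("user_a", "grocer"),
    ("user_c", "cafe"),
]

grocer_vector = entity_vector(transactions, "grocer")
cafe_vector = entity_vector(transactions, "cafe")
print(grocer_vector, cafe_vector)
```

Because every vector uses the same positions, the two vectors can be compared or combined entry by entry without any user identifier appearing in the data being analyzed.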

Machine learning generally refers to automated processes by which received data is analyzed to generate and/or update one or more models. Machine learning may include artificial intelligence such as neural networks, genetic algorithms, clustering, or the like. Machine learning may be performed using a training set of data. The training data may be used to generate the model that best characterizes a feature of interest using the training data. In some implementations, the class of features may be identified before training. In such instances, the model may be trained to provide outputs most closely resembling the target class of features. In some implementations, no prior knowledge may be available for training the data. In such instances, the model may discover new relationships for the provided training data. Such relationships may include similarities between data elements such as transactions or transaction categories as will be described in further detail below.

Requesting Entity generally refers to an entity, such as a business, a non-profit organization, an educational institution, an automobile dealer, a vehicle manufacturer, a financial institution, etc., that requests information and/or services from one or more of the systems discussed herein. For example, a requesting entity may comprise an automobile dealership that provides customer information for monitoring of events that may be indicative of opportunities to enhance relationships with particular customers, and the requesting entity may receive notifications of when such events occur so that appropriate action can be timely taken.

A recommendation or result encompasses information identified that may be of interest to a user having a particular set of features. For example, a recommendation or result may be developed for a user based on a collection of transaction or similar data associated with the user and through application of a machine learning process comparing that transaction data with third-party transaction data (e.g., transaction data of a plurality of other users). A recommendation may be based on a determined entity and may include other merchants or vendors related to or similar to the determined merchant. In some implementations, the recommendation may include recommendation content. The recommendation content may be text, pictures, multimedia, sound, or some combination thereof. In some implementations, the recommendation may include a recommendation strength. The strength may indicate a confidence level in the recommendation by the computing system. As such, the strength may be included to allow systems receiving the recommendation to decide how much credence to give the recommendation.

A message encompasses a wide variety of formats for communicating (e.g., transmitting or receiving) information. A message may include a machine readable aggregation of information such as an XML document, fixed field message, comma separated message, or the like. A message may, in some implementations, include a signal utilized to transmit one or more representations of the information. While recited in the singular, a message may be composed, transmitted, stored, received, etc. in multiple parts.

The terms determine or determining encompass a wide variety of actions. For example, “determining” may include calculating, computing, processing, deriving, looking up (e.g., looking up in a table, a database or another data structure), ascertaining and the like. Also, “determining” may include receiving (e.g., receiving information), accessing (e.g., accessing data in a memory) and the like. Also, “determining” may include resolving, selecting, choosing, establishing, and the like.

The term selectively or selective may encompass a wide variety of actions. For example, a “selective” process may include determining one option from multiple options. A “selective” process may include one or more of: dynamically determined inputs, preconfigured inputs, or user-initiated inputs for making the determination. In some implementations, an n-input switch may be included to provide selective functionality where n is the number of inputs used to make the selection.

The terms provide or providing encompass a wide variety of actions. For example, “providing” may include storing a value in a location for subsequent retrieval, transmitting a value directly to a recipient, transmitting or storing a reference to a value, and the like. “Providing” may also include encoding, decoding, encrypting, decrypting, validating, verifying, and the like.

A user interface (also referred to as an interactive user interface, a graphical user interface or a UI) may refer to a web-based interface including data fields for receiving input signals or providing electronic information and/or for providing information to the user in response to any received input signals. A UI may be implemented in whole or in part using technologies such as HTML, Flash, Java, .net, web services, and RSS. In some implementations, a UI may be included in a stand-alone client (for example, thick client, fat client) configured to communicate (e.g., send or receive data) in accordance with one or more of the aspects described.

2. Example Service Platform

As described above, decentralized computation frameworks may enable secure computation of information from multiple parties without requiring one or more of the parties to share data with any other parties that have data relevant to the computation. Such a decentralized computation framework may be implemented by a service platform as shown in FIG. 1. FIG. 1 shows a diagram of an example service platform 100 implementing secure multi-party computations across a plurality of entities (for example, the data providers as introduced above). As shown, the service platform 100 includes a requesting device 102 (for example, connected to the service platform 100 via the Internet or similar connection), which may be operated by, owned by or otherwise associated with any entity that is requesting one or more pieces of information or results based on data held by one or more other entities (for example, the institutions 106 a-106 g). As described herein, the requesting device 102 may correspond to a requesting individual, a requesting entity, a requesting institution, and so forth, where the requesting device 102 is used by the requesting individual, requesting entity, requesting institution, and so forth, to access the service platform 100. In some embodiments, the institutions 106 a-106 g comprise banks, financial institutions, and so forth. In some embodiments, the institutions 106 a-106 g have information useful for performing a service requested by or from the requesting device 102. The service platform 100 also includes computation nodes 104 a-104 d, which perform various computations based on information provided by the institutions 106 a-106 g (each with their own computing systems or servers). The service platform 100 includes the institutions 106 a-106 g, which each may be custodians or holders of private data regarding their clients, etc., and each of which may not wish to share their private data with any other entity. The service platform 100 also includes a network 108, which generally connects each of the requesting device, the computation nodes, and the institutions. In some embodiments, the computation nodes 104 a-104 d coupled via the network 108 comprise an SMPC platform 110.

The requesting device 102 may receive a request, for example, to verify an income for a client A. Verifying the client A's income may involve obtaining information from each of the institutions 106 a, 106 b, 106 f, and 106 g. However, each of these institutions 106 a, 106 b, 106 f, and 106 g may refuse or otherwise be hesitant to share its information regarding the client A with other institutions or with the requesting device 102.

Although each of the computation nodes 104 a-104 d may be shown in proximity to one or more particular institutions 106, the computation nodes 104 a-104 d may not be limited to working with only those particular institutions and may be able to work with any institutions 106 a-106 g connected to the network 108. In some embodiments, as shown in FIG. 1, one or more of the computation nodes 104 is in communication with its own (for example, local) network system and each of the computation nodes 104 is in communication with all of the other computation nodes 104. Accordingly, the computation nodes 104 a-104 d can exchange information and perform shared computing of data (for example, results from computations) without necessarily sharing the input data used in the computations. More details regarding the computation nodes 104 a-104 d and the network 108 are provided below.

The institutions 106 a-106 g may each comprise a local computing network (not shown in FIG. 1). The institutions 106 a-106 g may use their respective local computing networks to perform computations and/or other services on their own data that do not require or utilize data from another institution 106. By performing local processing at the institutions 106 a-106 g, the computation nodes 104 a-104 d may be reserved for processing of information from multiple institutions 106, thereby improving efficiencies of the service platform 100. The institutions 106 a-106 g are each connected to the SMPC platform comprising the computation nodes 104 a-104 d via a secure connection, as shown in FIG. 1 between each database of the institutions 106 a-106 g and the network 108. In some embodiments, the institutions 106 a-106 g are configured to provide one or more pieces of information or outcomes of processing the one or more of the pieces of information as encrypted data fragments to the computation nodes 104 a-104 d such that the encrypted data fragments comprise unrecognizable fragments that no computation node 104 can re-assemble to recover the one or more pieces of the information or the outcomes of processing the one or more pieces of information.

Two example use cases are provided below to illustrate operation of the algorithm described above with reference to Equation 2.0. The first example is a decentralized income verification/estimation solution. The second example is a decentralized credit scoring solution. Additional details for each solution are provided below.

Example Income Verification/Estimation Use Case

In income verification/estimation operations, the income verification/estimation may utilize review and verification of physical or electronic statements regarding each corresponding financial record (for example, bank records). However, such efforts may be time consuming and difficult to complete with respect to verification processes given privacy and data security concerns. Operations where financial institutions share data related to income verification freely with a centralized aggregator (for example, another financial institution or an unrelated third party) may also be negatively viewed given the data security and privacy concerns. As noted above, decentralized operations using the framework and/or service platform 100 of FIG. 1 may instead be used to implement secure processing of the income verification/estimation data from the multiple financial institutions.

For example, following the algorithm described with relation to Equations 1.0 and 2.0 above, the algorithm formulates a centralized version of an income verification/estimation operation as in Equation 3.0 below. According to Equation 3.0, each transaction in the consumer's banking accounts is processed and mapped to different categories such as regular income, expense, transfer, interest, and so forth. Irrelevant transactions may be filtered out and not used in further calculations to reduce computation overhead. For example, the Combine function a₁ identifies a transfer of funds operation by finding pairs of transactions of the same amount but opposite signs in different accounts. Then, a Combine function a₂ tallies up a total amount from all of the relevant transactions to derive the consumer's income.

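$$\text{consumer income} = \text{Combine}\left(\text{‘consumer’},\ a_2,\ \text{Combine}\left(\text{‘consumer,date,txn\_type’},\ a_1,\ \text{Map}\left(\text{‘txn’},\ l_3,\ \text{Map}\left(\text{‘txn’},\ l_2,\ \text{Map}\left(\text{‘txn’},\ l_1,\ X\right)\right)\right)\right)\right) \qquad \text{Equation 3.0}$$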

As described above, the algorithm decomposes the centralized solution to generate a decentralized solution by allowing the data to come from each of the data providers, as noted in Equation 4.0 below.

$$
\text{consumer income} = \text{Combine}\left(\text{‘consumer’},\ a_2,\ \text{Combine}\left(\text{‘consumer,date,txn\_type’},\ a_1,\ \begin{pmatrix}
\text{Map}\left(\text{‘txn’},\ l_3,\ \text{Map}\left(\text{‘txn’},\ l_2,\ \text{Map}\left(\text{‘txn’},\ l_1,\ X_{B_1}\right)\right)\right), \\
\text{Map}\left(\text{‘txn’},\ l_3,\ \text{Map}\left(\text{‘txn’},\ l_2,\ \text{Map}\left(\text{‘txn’},\ l_1,\ X_{B_2}\right)\right)\right), \\
\ldots, \\
\text{Map}\left(\text{‘txn’},\ l_3,\ \text{Map}\left(\text{‘txn’},\ l_2,\ \text{Map}\left(\text{‘txn’},\ l_1,\ X_{B_k}\right)\right)\right)
\end{pmatrix}\right)\right) \qquad \text{Equation 4.0}
$$

The algorithm then identifies the C_or_M_(local) and C_or_M_(secure) operations for Equation 4.0. For example, the algorithm identifies the Map functions in Equation 4.0 to be executable locally in the data providers. Furthermore, the Combine operation or function Combine(‘consumer,date,txn_type’, a₁, . . . ) of Equation 4.0 that cancels out the transfer pairs may be decomposable and may be executed locally first within each data provider. A second Combine function a₁′ may be added as a parent function that will be executed in the SMPC environment, for example to sum all transactions, etc., to verify/estimate the total income for the consumer. Thus, the second, or parent, Combine function a₁′ may be the C_or_M_(secure) operation, and any parent Combine functions will be executed in the SMPC environment. A resultant equation is shown in Equation 5.0 below.

$$
\text{consumer income} = \text{Combine}_{secure}\left(\text{‘consumer’},\ a_2,\ \text{Combine}_{secure}\left(\text{‘consumer,date,txn\_type’},\ a_1^{\prime},\ \begin{pmatrix}
\text{Combine}_{local}\left(\text{‘}B_1\text{,consumer,date,txn\_type’},\ a_1,\ \text{Map}_{local}\left(\text{‘txn’},\ l_3,\ \text{Map}_{local}\left(\text{‘txn’},\ l_2,\ \text{Map}_{local}\left(\text{‘txn’},\ l_1,\ X_{B_1}\right)\right)\right)\right), \\
\text{Combine}_{local}\left(\text{‘}B_2\text{,consumer,date,txn\_type’},\ a_1,\ \text{Map}_{local}\left(\text{‘txn’},\ l_3,\ \text{Map}_{local}\left(\text{‘txn’},\ l_2,\ \text{Map}_{local}\left(\text{‘txn’},\ l_1,\ X_{B_2}\right)\right)\right)\right), \\
\ldots, \\
\text{Combine}_{local}\left(\text{‘}B_n\text{,consumer,date,txn\_type’},\ a_1,\ \text{Map}_{local}\left(\text{‘txn’},\ l_3,\ \text{Map}_{local}\left(\text{‘txn’},\ l_2,\ \text{Map}_{local}\left(\text{‘txn’},\ l_1,\ X_{B_n}\right)\right)\right)\right)
\end{pmatrix}\right)\right) \qquad \text{Equation 5.0}
$$
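The local/secure split can be illustrated with a small plain-Python sketch. Everything here is an assumption made for illustration: the record fields, the category names, and the function names `map_local`, `combine_local`, and `combine_secure` are hypothetical, and the "secure" step is a plain sum standing in for a computation that would actually run inside the SMPC environment.

```python
from collections import defaultdict


def map_local(txns):
    """Stand-in for the chained Map steps: keep only relevant categories."""
    return [t for t in txns if t["txn_type"] in {"income", "transfer"}]


def combine_local(txns):
    """Stand-in for a_1 inside one provider: net amounts per
    (consumer, date, txn_type) so same-day transfer pairs cancel."""
    totals = defaultdict(int)
    for t in txns:
        totals[(t["consumer"], t["date"], t["txn_type"])] += t["amount"]
    return dict(totals)


def combine_secure(partials):
    """Stand-in for the parent Combine steps. A real deployment would run
    this on data fragments in the SMPC environment; a plain sum is used
    here only to show the data flow."""
    merged = defaultdict(int)
    for partial in partials:
        for key, amount in partial.items():
            merged[key] += amount  # cross-provider transfer pairs net to 0
    income = defaultdict(int)
    for (consumer, _date, _txn_type), amount in merged.items():
        income[consumer] += amount
    return dict(income)


bank_1 = [
    {"consumer": "c1", "date": "2019-01-04", "txn_type": "income", "amount": 3000},
    {"consumer": "c1", "date": "2019-01-05", "txn_type": "transfer", "amount": -500},
    {"consumer": "c1", "date": "2019-01-05", "txn_type": "expense", "amount": -40},
]
bank_2 = [
    {"consumer": "c1", "date": "2019-01-05", "txn_type": "transfer", "amount": 500},
]
partials = [combine_local(map_local(bank)) for bank in (bank_1, bank_2)]
income = combine_secure(partials)
```

In this toy run the outgoing transfer at the first provider and the matching incoming transfer at the second provider net to zero only in the parent step, which is why that step cannot be completed by either provider alone.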

This example is described in more detail with respect to FIG. 2 .

FIG. 2 shows a dataflow diagram 200 of an example verification of income request as processed by the service platform 100 shown in FIG. 1. The dataflow 200 shown in FIG. 2 begins with a requesting institution 202, which may request a service that involves analyzing “private” information from various sources (for example, information that the sources do not want to share with each other). In some embodiments, the requested service comprises one or more of an income, employment, and assets verification, fraud prevention, cashflow, and revenue, among others. As noted above, the requesting institution 202 may correspond to or use the requesting device 102 to access the service platform 100. Additionally, such service platforms may be used in various industries, including healthcare or medical record aggregation and analysis, advertising, utilities, insurance, financial services, data analysis, credit monitoring, credit data bureaus or alternative data bureaus, location-based risk assessment, insurance fraud, attributing purchases to advertisements, and many others.

As shown in FIG. 2, the requesting institution 202 transmits an income verification request or inquiry 203 to an income verification service 204. The request 203 may include an identifier of an individual or entity (for example, when the request is for an income verification or a credit score or the like). The income verification service 204 communicates the received identifier with an identifier database (ID DB) 206 and with one or more banks 208. The ID DB 206 may determine whether there are aliases or other identifiers associated with the identifier received in the request 203. For example, the request 203 may include the identifier “Bob Dixon” along with some personal information for Bob Dixon (for example, a social security number (SSN), an address, a date of birth (DOB), and so forth). The ID DB 206 may use the identifier Bob Dixon to identify additional aliases or identifiers that correspond to the Bob Dixon identifier. For example, the ID DB 206 returns identifiers Robert Dixon and Rob Dixon that share one or more of the SSN, address, or DOB with the Bob Dixon identifier in records for the ID DB 206. The ID DB 206 may provide each of these identifiers to the income verification service 204. The income verification service 204 may communicate with each of the institutions 106 a-106 g and provide them with the identifiers received from the ID DB 206 and the requesting institution 202. FIG. 2 shows only one institution 106 a, for example a bank environment.
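The alias lookup can be pictured with a toy in-memory table. This is only a sketch: the `ID_DB` records, the `MATCH_FIELDS` tuple, and the `resolve_aliases` function are hypothetical, all values are made up, and a production identity service would likely apply much stricter matching rules than "shares any one field".

```python
# Hypothetical ID DB records; every name and value below is invented.
ID_DB = [
    {"name": "Bob Dixon", "ssn": "000-00-0001", "dob": "1980-01-01"},
    {"name": "Robert Dixon", "ssn": "000-00-0001", "dob": "1980-01-01"},
    {"name": "Rob Dixon", "ssn": None, "dob": "1980-01-01"},
    {"name": "Bob Dixon Jr.", "ssn": "000-00-0002", "dob": "2005-06-15"},
]
MATCH_FIELDS = ("ssn", "dob")


def resolve_aliases(query):
    """Return names of records sharing at least one match field with `query`."""
    aliases = []
    for record in ID_DB:
        shares_field = any(
            query.get(field) is not None and query.get(field) == record.get(field)
            for field in MATCH_FIELDS
        )
        if shares_field and record["name"] not in aliases:
            aliases.append(record["name"])
    return aliases


aliases = resolve_aliases(
    {"name": "Bob Dixon", "ssn": "000-00-0001", "dob": "1980-01-01"}
)
```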

The institution 106 a receives the identifiers and accesses the relevant information and/or records (hereinafter information) from a local database (for example, the DDA transactions 208) of the institution 106 a. In some embodiments, the institution 106 a performs any relevant and local processing on the relevant information from the DDA transactions 208 before providing the corresponding information to the computation nodes 104 a-104 d of the SMPC platform via the network 108. The local processing may be completed by a processor 212. In some embodiments, the institution 106 a stores details of the request, the local processing, and the provided information in a local audit log 210. The institution 106 a may provide the corresponding information as fragmented attributes 214 to the SMPC platform. The fragmented attributes may merely comprise the bare data needed in conjunction with data from other institutions 106 without any identifying information or details of the fragments.

In some embodiments, the institution 106 a calculates particulars of the income verification based on only its own records (for example, the information from the DDA transactions 208) and then fragments the calculated particulars for communication to the SMPC platform. Due to the fragmentation, as will be described in further detail herein, none of the private information of the institution 106 a or the results of the local computations performed by the institution 106 a is likely to be determined based on receiving some of the fragments, and privacy between the institutions 106 is maintained. The SMPC platform then uses the fragmented local information from multiple banks to determine, based on the private information received from a plurality of banks, results for the income verification inquiry and reports the results back to the income verification service 204 as inquiry results 220. The income verification service 204 then provides the results to the requesting institution 202 as a response 222. Details of this process are provided below.
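This passage does not specify the fragmentation scheme. As one common illustration, and purely as an assumption rather than the scheme used by the platform, additive secret sharing over a prime modulus splits a value into random-looking shares, one per computation node. The names `MODULUS`, `fragment`, and `reassemble` are hypothetical.

```python
import secrets

MODULUS = 2**61 - 1  # a Mersenne prime; an illustrative choice only


def fragment(value, n_nodes):
    """Split `value` into n_nodes additive shares modulo MODULUS.
    Any subset of fewer than n_nodes shares is uniformly random."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_nodes - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reassemble(shares):
    """Recover the value; requires all shares."""
    return sum(shares) % MODULUS


# Two institutions fragment their locally computed amounts (in cents).
bank_1 = fragment(300_000, 4)
bank_2 = fragment(125_000, 4)

# Each node adds the two shares it holds, never seeing either input.
node_sums = [(a + b) % MODULUS for a, b in zip(bank_1, bank_2)]
total = reassemble(node_sums)
```

Under this assumed scheme, addition needs no communication between nodes, which is one reason summation-style Combine steps are comparatively cheap in an SMPC setting.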

For example, the dataflow 200 of the service platform 100 may be applied to verifying/estimating Bob Dixon's income using, for example, Bob Dixon's account history. In some embodiments, the account history may include records from a number of financial institutions. The records may include transaction descriptions for transactions that involve or occur at the respective financial institution. At a high level, performing the income verification/estimation by the service platform 100 may involve analyzing each transaction for each financial institution. Given the number of transactions that an average consumer has in their financial records, having the computation nodes 104 a-104 d of the SMPC platform analyze each of these transactions would be resource and time intensive. Thus, some processing of these transactions may be offloaded to each financial institution such that each financial institution performs some “preprocessing” of its financial transactions and records.

Different approaches may be used for verifying/estimating income for a consumer. One example approach may assume that transaction information received in real-time includes a transaction amount and a transaction date and that a limited number of pay-stream periods exist (for example, weekly, biweekly, monthly, quarterly, annually, and so forth). As such, each identified transaction is “fit” into one of the existing pay-streams. Another example approach may assume that transactions include descriptions, which may be used to group the transactions into different streams. Then, date differences between the transactions in each stream are analyzed to determine periodic streams.
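The date-difference analysis in the second approach can be sketched as follows. The period lengths in `PERIODS`, the tolerance, and the function name `classify_stream` are assumptions chosen for illustration; the text does not prescribe a particular matching rule.

```python
from datetime import date
from statistics import median

# Candidate pay periods in days (assumed values for illustration).
PERIODS = {"weekly": 7, "biweekly": 14, "monthly": 30, "quarterly": 91, "annually": 365}


def classify_stream(dates, tolerance_days=3):
    """Return the period label whose length best matches the median gap
    between consecutive dates, or None if no period is within tolerance."""
    ordered = sorted(dates)
    if len(ordered) < 2:
        return None
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    typical = median(gaps)
    label, length = min(PERIODS.items(), key=lambda item: abs(item[1] - typical))
    return label if abs(length - typical) <= tolerance_days else None


# Dates of deposits that share one description, i.e. one stream.
paycheck_dates = [date(2019, 1, 4), date(2019, 1, 18), date(2019, 2, 1), date(2019, 2, 15)]
period = classify_stream(paycheck_dates)
```

The median is used here so that a single late or early deposit does not change the detected period; other robust statistics would serve equally well.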

In some embodiments, much of the processing for income verification/estimation may be performed locally in each institution 106. For example, deposits from pay checks and payments of bills may be easily identified and used to determine an income of Bob Dixon by a single institution 106. For example, the processor 212 of the institution 106 a may locally calculate an income of a particular customer based on its local transactions without involving another institution 106. However, some transactions (such as transfers between accounts or institutions 106) may require review of records and/or data from the institutions 106. For example, the income verification/estimation system should not interpret a transaction regarding transfer of funds between accounts of Bob Dixon as income for Bob Dixon. In order to reduce resource demand and to improve efficiencies and maintain data privacy and security, the processor 212 of each institution 106 may perform the bulk of the transaction/record processing in the income verification/estimation. Using the SMPC platform to cancel transactions that indicate transfers between accounts while maintaining data security and privacy may have a higher computational cost as compared to when data security and privacy concerns are paramount. However, using the SMPC platform may maintain higher accuracy.

In some embodiments, the processor 212 of the institution 106 a may perform pre-processing of the transactions and records local to the institution 106 a (for example, the transactions and records stored in the DDA transactions 208). The pre-processing may comprise any processing that is performed locally (for example, the C_or_M_(local) operations or transactions described above). In some embodiments, the pre-processing by the processor 212 comprises discarding or ignoring transactions involving amounts of less than a threshold value, for example $100. Such an exclusion of transactions involving amounts that do not meet the threshold value may further reduce calculation overhead by effectively ignoring most daily activities, which involve small-amount transactions. Although such pre-processing may result in losing some relevant transactions or records (for example, because some transactions or records involving amounts less than the threshold value can be qualified as income), the pre-processing by the processor 212 likely will not greatly impact a total income for Bob Dixon. In some embodiments, such exclusion of transactions especially saves calculation overhead when SMPC transaction cancelling is involved.

The processor 212 may further pre-process the transactions and records local to the institution 106 a by discarding or ignoring incoming transactions (for example, deposits) from invalid or improper account types. For example, an invalid or improper account type may comprise any account type that is not a savings account, a checking account, an investment account, and the like. For example, if an “incoming” transaction occurs in a credit card type account, the transaction may be assumed to be a payment transaction and not an income transaction. Therefore, such transactions from the invalid or improper account types can be ignored for the purposes of verifying/estimating income. The pre-processing may further cancel intra-institution transfers. For example, when the institution 106 a includes records for a savings account and a checking account for Bob Dixon, transfers of funds between the savings and checking accounts may be ignored as not involving income, but rather funds transfers. Similarly, if the institution 106 a includes a credit card account and a checking/savings account, credits from the credit card account to the checking/savings account or payments from the checking/savings account to the credit card account may be ignored as not being income.
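The three local filters described so far (amount threshold, account type, and intra-institution transfer cancelling) can be sketched as a single function. The record keys, the `VALID_INCOME_ACCOUNTS` set, and the same-date matching rule for transfers are assumptions made for this example only.

```python
THRESHOLD = 100  # minimum absolute amount, following the $100 example
VALID_INCOME_ACCOUNTS = {"savings", "checking", "investment"}


def preprocess(txns):
    """Apply the three local filters to one institution's transactions.
    Each txn is a dict with hypothetical keys: account, account_type,
    date, amount (positive means incoming)."""
    # 1) Drop small-amount transactions.
    kept = [t for t in txns if abs(t["amount"]) >= THRESHOLD]
    # 2) Drop incoming transactions on account types that cannot hold income.
    kept = [
        t for t in kept
        if not (t["amount"] > 0 and t["account_type"] not in VALID_INCOME_ACCOUNTS)
    ]
    # 3) Cancel intra-institution transfers: same date, opposite amounts,
    #    different accounts (an assumed matching rule).
    cancelled = set()
    for i, first in enumerate(kept):
        if i in cancelled:
            continue
        for j in range(i + 1, len(kept)):
            second = kept[j]
            if (
                j not in cancelled
                and first["date"] == second["date"]
                and first["amount"] == -second["amount"]
                and first["account"] != second["account"]
            ):
                cancelled.update({i, j})
                break
    return [t for k, t in enumerate(kept) if k not in cancelled]


txns = [
    {"account": "chk", "account_type": "checking", "date": "2019-01-04", "amount": 3000},
    {"account": "chk", "account_type": "checking", "date": "2019-01-05", "amount": -25},
    {"account": "cc", "account_type": "credit_card", "date": "2019-01-06", "amount": 200},
    {"account": "chk", "account_type": "checking", "date": "2019-01-07", "amount": -500},
    {"account": "sav", "account_type": "savings", "date": "2019-01-07", "amount": 500},
]
remaining = preprocess(txns)
```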

As part of the pre-processing, the processor 212 may apply a positive model or a negative model to predict whether a transaction in the DDA transactions 208 of the institution 106 a is part of an inter-institution transfer pair. For example, the positive model comprises a model that takes the input of a transaction whose amount is “positive” (meaning that the transaction is an incoming transaction). The processor 212 then tries to predict whether the incoming transaction belongs to an inter-bank transfer pair or not. The processor 212 may analyze various description fields associated with the incoming transaction that are available, for example a transaction description field, a transaction amount field, a transaction time field, and a transaction account field. In some embodiments, one or more machine learning algorithms can be applied to efficiently and accurately perform such prediction, which may include an implementation that relies on a bag-of-words model. The negative model is similar to the positive model, in that the negative model also aims to predict if a transaction belongs to an inter-bank transfer pair or not, but it uses “negative” transactions instead of “positive” transactions. The negative model can also be used to improve accuracy of the transfer identification pre-processing. However, while the positive model tries to predict whether a transaction is part of a transfer pair in general, the negative model aims to predict whether a transaction is part of a transfer pair that is not predicted by the positive model. In other words, for the transactions already “caught” or identified as part of a transfer pair by the positive model, the negative model need not catch or identify that transaction. This also implies that the negative model performance may depend on a cutoff of the positive model.

Once the processor 212 completes the pre-processing for verifying/estimating Bob Dixon's income, the processor 212 may generate the fragmented attributes 214 for distribution to the computation nodes 104 a-104 d for processing by the service platform. In some embodiments, the fragmented attributes 214 may comprise the transactions that may be part of a transfer pair and/or a verified/estimated income as determined by that institution 106 a.

The service platform 100 may perform the processing of the C_or_M_(secure) functions or operations. For example, the computation nodes 104 a-104 d may perform privacy-preserving comparisons to identify any transfers between institutions that should not be interpreted as income. Such privacy-preserving comparison in the SMPC platform may be very expensive (for example, time intensive, computation intensive, and so forth). The computation nodes 104 a-104 d of the SMPC platform may perform calculations in near real-time that sacrifice some aspects of data privacy and security to reduce resource costs.

For example, the computation nodes 104 a-104 d may use the following parameters to perform the privacy-preserving comparison:

- A number of transactions per day from each institution 106, for example, the transactions that the processor 212 (for example, the agent of the institution 106 a) submitted to the computation nodes 104 a-104 d as data fragments after the pre-processing described above.
- Whether each transaction is positive or negative.
- A residue of a transaction amount divided by some small prime number (for example, 2).

The computation nodes 104 a-104 d may calculate whether a pair of positive and negative transactions cancels out (for example, sums to 0). Data privacy and security can be maintained by calculating the sum of the positive and negative transactions of the potential transfer pair without revealing the sum. For example, the computation nodes 104 a-104 d may multiply the sum by some positive random number. The sign of the product of the sum and the positive random number is the same as the sign of the sum. If the revealed result of the multiplication is zero, then the computation nodes 104 a-104 d can determine that the pair of transactions is likely a transfer pair and indicate as such (and exclude the pair of transactions from the income verification/estimation). This maintains data privacy and security because the sum of the transactions remains unknown as long as the random number used is unknown. In some embodiments, the range of the random number may have a threshold minimum number.
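The masking step can be illustrated with a simplified simulation. Several assumptions are made that go beyond the text: amounts are additively shared modulo a prime `P`, the mask `r` is a single nonzero value applied by every node (a real protocol would typically keep `r` itself secret-shared), and only the zero test is shown, because sign is not meaningful in modular arithmetic. The function names are hypothetical.

```python
import secrets

P = 2**61 - 1  # prime modulus; an illustrative choice


def share(value, n_nodes):
    """Additively share `value` modulo P across n_nodes."""
    shares = [secrets.randbelow(P) for _ in range(n_nodes - 1)]
    shares.append((value - sum(shares)) % P)
    return shares


def is_transfer_pair(credit_cents, debit_cents, n_nodes=4):
    """Reveal only r * (credit + debit) mod P, which is zero exactly when
    the two amounts cancel (for amounts much smaller than P)."""
    credit_shares = share(credit_cents % P, n_nodes)
    debit_shares = share(debit_cents % P, n_nodes)
    r = secrets.randbelow(P - 1) + 1  # nonzero random mask
    # Each node works on its own shares only: add them, then scale by r.
    masked_shares = [((c + d) * r) % P for c, d in zip(credit_shares, debit_shares)]
    revealed = sum(masked_shares) % P  # equals r * (credit + debit) mod P
    return revealed == 0


match = is_transfer_pair(150_000, -150_000)
mismatch = is_transfer_pair(150_000, -149_999)
```

Because `P` is prime and `r` is nonzero, the revealed product is zero only if the underlying sum is zero modulo `P`; a nonzero product is a uniformly scaled value that does not disclose the sum itself.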

In some embodiments, the processor 212 may perform one or more additional operations to further reduce processing overhead of the computation nodes 104 a-104 d of the service platform 100. For example, the processor 212 may separate the data provided to the computation nodes 104 a-104 d. For example, for the potential transfer pairs, the processor 212 may separate positive transactions from negative transactions. Alternatively, or additionally, the processor 212 may separate transactions based on the corresponding residue. These improvements, however, come at the sacrifice of some aspect of data privacy. They can nevertheless be implemented when the minor sacrifice (for example, revealing whether the amount of income is odd or even) does not pose a threat of leaking more sensitive information, while bringing a large reduction in computation overhead.
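The saving from separating transactions by sign and residue can be seen in a small sketch that works on plain integers. In the platform the comparisons would run on fragments; the function name `candidate_pairs` and the sample amounts are hypothetical.

```python
from collections import defaultdict


def candidate_pairs(amounts, prime=2):
    """Pair positive with negative amounts only when their absolute values
    share the same residue modulo `prime`; other pairs cannot sum to 0."""
    positives, negatives = defaultdict(list), defaultdict(list)
    for amount in amounts:
        bucket = positives if amount > 0 else negatives
        bucket[abs(amount) % prime].append(amount)
    return [
        (p, n)
        for residue, pos_list in positives.items()
        for p in pos_list
        for n in negatives.get(residue, [])
    ]


amounts = [1500, 701, -1500, -250, -333]
pairs = candidate_pairs(amounts)
```

With two positive and three negative amounts, an exhaustive comparison needs six secure comparisons, while the residue buckets leave three. The price is that each transaction's residue (here, odd or even) is disclosed to the computation nodes.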

FIG. 3 shows an exemplary architecture for the example service platform of FIG. 1 as it processes the verification of income request of FIG. 2. The exemplary architecture includes three major components: the service 302, the partners 304A-304C, and the computation environment 306. FIG. 3 shows data and/or other communications that flow between the various major components. The service 302 may correspond to the example income verification service 204 of FIG. 2. The partners 304 may correspond to the institutions 106. The computation environment 306 may correspond to the computation nodes 104 and the network 108 of FIGS. 1 and 2.

2.1. Service

The service 302 serves as an entry point of the architecture and the service platform of FIG. 1. For example, the service 302 receives a service request from a requesting entity via a web interface or application. Accordingly, the service 302 acts as a central controller that receives an inquiry (for example, in the form of personally identifiable information, “PII”) from an inquiring entity (for example, a particular client, institutional client, or the requesting institution 202). The inquiry may be received via a human-to-machine interface (for example, a server-side application or similar interface) or machine-to-machine interfaces (for example, from a particular financial institution, such as via an Application Programming Interface, “API”). The service 302 may resolve the identity belonging to the PII based on various processing and/or verification steps, for example via an identity resolution service 303. In some embodiments, the identity resolution service 303 may comprise the ID DB 206 or a similar service. For example, the identity resolution service 303 may identify all identifiers that correspond with the PII received with the inquiry, where a different identifier may be assigned to the individual for different partners, in some embodiments. The resolved identity may comprise a hashed identifier, “ID”, that is then provided to each partner 304, initiating the service (for example, income identification and calculation) at the partners 304A-304C. Lastly, the service 302 collects final results from the computation environment 306 and returns the results back to the inquirer.
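The text does not say how the hashed identifier is derived. One plausible construction, shown here strictly as an assumption, is a keyed hash (HMAC-SHA-256) with a separate key per partner, so that the same individual maps to a different identifier at each partner. The function name `hashed_id` and the keys are hypothetical.

```python
import hashlib
import hmac


def hashed_id(resolved_identity: str, partner_key: bytes) -> str:
    """Derive a per-partner identifier from a resolved identity string.
    The normalization and the keyed-hash construction are assumptions."""
    normalized = resolved_identity.strip().lower().encode("utf-8")
    return hmac.new(partner_key, normalized, hashlib.sha256).hexdigest()


# Hypothetical per-partner keys; real keys would be generated and stored securely.
id_for_partner_a = hashed_id("Bob Dixon", b"partner-a-demo-key")
id_for_partner_b = hashed_id("Bob Dixon", b"partner-b-demo-key")
```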

The service 302 may include four components, in some embodiments:

1) A server-side program or application (which may be referred to as a webapp or web app or interface #1) for interaction with a browser, client-side application or similar interface;
2) An orchestrator;
3) An audit trail database; and
4) A proxy/API for requests (or interface #2).

The server-side application may authenticate a user (for example, the inquiring entity) according to one or more authentication methods before allowing the inquiring entity to make a request of the service platform. In some embodiments, the user authentication is managed by a single sign on “SSO” or similar service.

The orchestrator works with each partner 304 as well as the decentralized computing environment 306 to fulfill any request or inquiry made of the service platform by the inquiring entity. The orchestrator may also receive results of computations by the computing environment and relay those results to the inquiring entity via one or more of the proxy, the server-side application, and/or similar interfaces.

The audit trail database records all interactions with the service 302, including failed sign on attempts and/or completed inquiries, along with identities of the inquiring entity. In some embodiments, one or more blockchain services may be used for auditing and logging.

The proxy/API allows third-party applications or interfaces to integrate with the service 302 to provide inquiries and/or receive results from inquiries. For example, the proxy/API allows for integration with systems from entities in the mortgage and/or other loan underwriting industries.

In some embodiments, the service 302 may correspond to a service provided by an entity (for example, an income verification service, a credit scoring service, and so forth). As such, the service 302 may be provided by a number of components, for example the components shown in the service 302 in FIG. 3. Alternatively, or additionally, the service 302 may correspond to a server system associated with an operator of the service 302. For example, the service 302 may be provided by an organization and correspond to the server system of the organization, where the components shown in the service 302 correspond to different modules or components in the server system of the organization.

The service 302 may be a “dynamic” service or a “static” service. As the dynamic service, the service 302 may receive different inquiries from and provide different responses to a requesting entity 202. For example, the dynamic service 302 may receive and process one or more of the income verification request and the credit score request, among others, as described herein. As such, the dynamic service 302 may dynamically change the processing performed by the computation environment 306. For example, the dynamic service 302 provides different secure operations to the computation environment 306 for processing by the corresponding computation nodes 104. The dynamic service 302 may instruct which secure mapping or computation operations the computation environment 306 (and therefore, the computation nodes 104) is to perform as part of the SMPC platform as compared to which operations will be performed by the partners 304 locally.

In some embodiments, the dynamic service 302 updates the operations to be performed by the computation environment 306 dynamically, for example via a push update or similar notification. In some embodiments, the dynamic service 302 provides the appropriate operation(s) to the computation environment 306 via the update, where the dynamic service 302 determines and provides the appropriate operation(s) based on the inquiry received from the requesting device 102. For example, the dynamic service 302 may comprise a library of operations associated with different inquiries and provide the appropriate operation(s) to the computation environment 306 based on the inquiry. In some embodiments, the dynamic service 302 provides an indicator identifying the appropriate operation(s) for the computation environment 306, which the computation environment 306 uses to look up the appropriate operation(s) in a local library of operations (or similar operation source). For example, the dynamic service 302 may convey to the computation environment 306 the indicator identifying that the inquiry was the income verification request. Based on the received indicator, the computation environment 306 may obtain the appropriate operation(s) to perform on data fragments received from the partners 304. Similarly, partner systems may receive software updates or other executable instructions from the service or an operator of the service that enable the partners, such as via the client-side module of the partner, to implement the appropriate client-side functionality to generate fragments for a given inquiry type.
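The indicator-based lookup described above can be illustrated with a minimal Python sketch. The names `OPERATION_LIBRARY` and `resolve_operation`, the indicator strings, and the placeholder operations are all hypothetical and are not taken from the described embodiments; the sketch only shows an indicator being resolved against a local library of operations.

```python
from typing import Callable, Dict, List

# Hypothetical local library of operations, keyed by an inquiry-type indicator.
# The operations are placeholders chosen only so the example runs.
OPERATION_LIBRARY: Dict[str, Callable[[List[int]], int]] = {
    "income_verification": sum,  # placeholder: aggregate local values
    "credit_score": max,         # placeholder: stand-in for a scoring step
}


def resolve_operation(indicator: str) -> Callable[[List[int]], int]:
    """Look up the operation that corresponds to the received indicator."""
    try:
        return OPERATION_LIBRARY[indicator]
    except KeyError:
        # An unknown indicator is rejected rather than silently ignored.
        raise ValueError(f"unknown inquiry indicator: {indicator!r}") from None
```

Under these assumptions, only the short indicator needs to travel with the inquiry, while the operations themselves stay in the local library.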

When the service 302 is a static service, the service 302 may not be configured to handle different types of inquiries. Thus, the static service 302 may not change the type of inquiries it can process. However, the static service 302 may still send an indicator to the partner 304 to indicate a type of service that the static service 302 is performing so that the partner 304 applies the appropriate operations. The static service 302 may provide the operation(s) in the indicator or merely identify the appropriate operation(s) for the partner 304 to obtain from a local library (or similar operation source).

In some embodiments, the service 302 (for example, via the interface #1 or the interface #2) will receive a request or inquiry, for example from the requesting device 102 and the requesting entity 202. The interface #2 may send the PII related to the inquiry to the orchestrator of the service 302, which will then resolve the PII to obtain the identifiers (hashed or otherwise) associated with the PII with the identity resolution service 303. The orchestrator may then send the obtained identifiers to all CSMs 212. In some embodiments, each CSM 212 receives a different identifier or hashed value that is associated with the same PII but that will only be understandable to appropriate CSMs 212. Each of the CSMs 212 may then query the local database for their respective partner 304 for relevant records or transactions and receive the corresponding records and transactions. Additionally, the CSMs 212 may process the received records and transactions to identify a value related to the inquiry based on the local information only. The CSMs 212 may then fragment the values and send the fragments of the values to the computation environment 306. The computation environment 306 may compute combined attributes based on the fragments received from each of the CSMs 212 and return a result to the orchestrator of the service 302 for distribution to the requesting device 102 and requesting entity 202 as appropriate.

2.2. Partner

Each partner 304 may include one or more components or modules (for example, a client side module, “CSM”) installed in its environment. The CSM may correspond to the processor 212 of FIG. 2 and perform processing related to the inquiry locally (for example, local to the partner 304) to reduce computations performed by the computation nodes in the computation environment 306 (for example, the computation nodes 104a-104d). The CSM 212 may respond to requests from the service 302 to initiate the client-side income calculation, for example when the CSM 212 receives the hashed ID from the service 302. The CSM 212 then queries a deposit database of the partner 304 to identify and/or retrieve transactions associated with the hashed ID. The CSM 212 may then use a machine learning based solution to identify income related transactions. In some aspects, the machine learning based solution is pre-trained based on a large volume of deposit transactions collected from a wide range of financial institutions to capture the variations in the data and enable the machine learning based solution to appropriately identify and retrieve corresponding income related transactions. Furthermore, the machine learning based solution can be fine-tuned with specific data for that particular financial institution (e.g., partner 304) if said specific data is determined to be beneficial to the identification and/or retrieval of income related transactions. The machine learning based solution (for example, an income identifying algorithm) classifies the income transactions and summarizes as much as possible at the client-level for that particular financial institution or partner 304. These local results are then fragmented, based on the mathematical design at the core of SMPC, into unrecognizable pieces. The fragmented pieces are then sent to the computation environment 306 for aggregation. As the information of the data has been obfuscated and fragmented before leaving the premises of the computation environment of each partner 304, interception of the individual pieces, or even multiple pieces, of the fragmented data cannot be used to recover the original data.
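As an illustration of how a local result might be split into unrecognizable pieces, the following Python sketch uses additive secret sharing over a prime modulus. This is one common SMPC building block and is an assumption here; the description does not specify the fragmentation scheme. The names `MODULUS`, `fragment`, and `reconstruct` are hypothetical.

```python
import secrets
from typing import List

# Assumed modulus for illustration only; values must lie in [0, MODULUS).
MODULUS = 2**61 - 1


def fragment(value: int, n_nodes: int) -> List[int]:
    """Split a local value into n_nodes additive shares, one per computation node."""
    if n_nodes < 2:
        raise ValueError("at least two computation nodes are required")
    # The first n_nodes - 1 shares are uniformly random.
    shares = [secrets.randbelow(MODULUS) for _ in range(n_nodes - 1)]
    # The last share is chosen so that all shares sum to the value modulo MODULUS.
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares: List[int]) -> int:
    """Recover the value; this requires every share."""
    return sum(shares) % MODULUS
```

In this sketch, any proper subset of the shares is uniformly distributed and therefore carries no information about the original value, which mirrors the property that intercepted pieces cannot be used to recover the original data.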

Each partner 304 may include the CSM 212, which is configured to interface with databases and/or local computing systems of the partner 304 and carry out any needed computations (for example, data fragmentation, etc.) locally, thus reducing risk of confidential or private information being communicated away from the partner 304. In some embodiments, the CSM 212 for each partner may authenticate itself with the orchestrator and/or the computing environment 306 to ensure no compromises exist in the service platform.

As described herein, the CSM 212 of each partner 304 may correspond to local processing resources 212 or networks of the partner 304. As such, the CSM 212 may be tasked with providing all local processing of information from the partner 304, whether that be for credit score applications, income verification applications, and so forth. As such, the CSM 212 may receive indications of different operations to perform in the local processing. In some embodiments, the CSM 212 may receive push updates (or similar updates) indicating which appropriate operations the CSM 212 should apply to the partner data. For example, the service 302 may provide a push update to the CSM 212 to indicate the type of request received. The push update from the service 302 may include the appropriate operations for the CSM 212 or may provide an identifier that the CSM 212 uses to obtain the appropriate operations from a local library. Thus, the same CSM 212 for the partner 304 can be used to provide data fragments to the computation environment 306 for different requests (for example, for an income verification request, a credit score request, and so forth). The CSM 212 may use the push updates to ensure that the CSM 212 is using the appropriate operations such that the data fragments can be used by the computation environment 306 to generate a response to the inquiry. In some embodiments, each partner 304 comprises a single CSM 212 that is used for all local processing performed to generate the data fragments for distribution to the computation environment 306. Thus, the CSM 212 may be configured to update the operations (for example, in response to updates or indicators from the service 302) it can perform on the partner data to provide the data fragments. In some embodiments, each partner comprises multiple CSMs 212 where CSMs 212 do not need to update their operations based on different inquiry types, and so forth.

2.3. Computation Environment

The computation environment 306 jointly computes a function over inputs from the partners 304 while keeping those inputs private from each of the partners 304 and from the computation environment 306 itself by using a group of computation nodes (“CNs”) 104. Such computations guarantee:

    1) Input privacy, ensuring that no information about the private data held by one of the partners 304 can be inferred from the messages sent during the execution of the protocol unless more than half of the CNs 104 are compromised; and
    2) Correctness, in that the service platform and associated architecture are designed to handle up to n/2−1 compromised CNs 104, meaning that as long as a majority of CNs 104 execute the protocol faithfully, the results are guaranteed to be correctly computed. In the event that a majority of CNs 104 is compromised, the protocol design guarantees that the remaining uncompromised CNs 104 will detect the compromise with overwhelming probability.

Based on the architecture shown in FIG. 2, the CNs 104 may actively or passively wait for the fragmented data sent from each partner 304 before beginning their computations. As the computations are being performed, none of the intermediate results are available to any individual CNs 104 or the partners 304 that submitted data. Only upon the completion of the entirety of the computations is the result “revealed” and communicated directly to the service 302 for communication to the inquiring entity, for example via an interface or the third-party application. In some embodiments, as shown in FIG. 2, the CNs 104 may be arranged in a plurality of groups. Such an arrangement may improve efficiencies in scalability and robustness of the computation environment 306.

In some embodiments, the services and/or methods provided by the service platform are enhanced by a sharing of information between each partner 304 and the service 302. For example, each partner 304 may synchronize its client IDs with the service's hashed IDs so that the partner 304 knows what client corresponds to hashed IDs received from the service 302 when an inquiry is being processed. This may comprise the partner 304 extracting the PII and corresponding customer ID/account ID from its records and/or databases and sending them to the service in a secured manner. In response, the partner 304 receives a mapping table or similar structure that maps each of the partner's customer IDs/account IDs to one of the hashed/salted IDs from the service 302. No PII will be communicated in return, to minimize security risks. The returned salted/hashed ID will be unique for each individual customer ID/account ID for each partner and between different partners so the participating partners 304 will not be able to reference each other's data via received hashed IDs.
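One way to obtain identifiers that differ between partners for the same individual is a keyed hash with a per-partner secret. The Python sketch below uses HMAC-SHA256 from the standard library; the choice of HMAC, the normalization step, and the names `partner_scoped_id` and `build_mapping_table` are assumptions made for illustration and are not stated in the description.

```python
import hashlib
import hmac
from typing import Dict, Iterable, Tuple


def partner_scoped_id(pii: str, partner_salt: bytes) -> str:
    """Derive a salted/hashed ID that is specific to one partner."""
    # Hypothetical normalization so the same PII always hashes identically.
    normalized = pii.strip().lower().encode("utf-8")
    return hmac.new(partner_salt, normalized, hashlib.sha256).hexdigest()


def build_mapping_table(
    records: Iterable[Tuple[str, str]], partner_salt: bytes
) -> Dict[str, str]:
    """Map each customer ID to its hashed ID; the PII itself is not returned."""
    return {
        customer_id: partner_scoped_id(pii, partner_salt)
        for customer_id, pii in records
    }
```

Because each partner's secret is different in this sketch, two partners holding the same individual's PII receive unrelated hashed IDs and cannot cross-reference each other's records through them.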

The computation environment 306 implements an SMPC framework in a hosted fashion, thereby avoiding the complexity of requiring each partner 304 to implement its own computation environment. Each of the CNs 104 in a Computation Group (HCG) receives different fragments of information from the CSM 212 deployed in each partner 304 and never combines the information together throughout any calculations.

In some embodiments, the computation environment is hosted as a Platform-as-a-Service (PaaS). Accordingly, one or more aspects of the computation environment 306 is implemented as a micro-service. By implementing the computation environment as a PaaS, management of individual resources is simplified and automatic restart in the event of failure is provided.

In some embodiments, the computation environment 306 hosted as the PaaS comprises a plurality of SMPC engines, where each CN 104 or group of CNs 104 comprises an SMPC computation engine. In some embodiments, the SMPC computation engines are run separately in CNs 104 for security purposes. Each of the SMPC engines, therefore, only receives a fragment of any information from the partners 304 so that no node at any time can re-assemble/recover the complete information from the fragments it received. The CNs 104 carry out the calculations on these fragments from the partners 304 using Secure Multi-Party Computation protocols to achieve the highest level of security. In addition to the fragmentation concept, communication channels between the engines/CNs 104 are secured via one or more protocols.

In some embodiments, the output of the computation environment 306 may be a response to the initial inquiry or may be an aggregated value based on the fragments received from each of the partners 304A-304C. In some embodiments, the output of the computation environment 306 may be used in additional computations or analysis as applicable.

As each partner may store data in distinct formats, etc., the service 302 may request and/or require that each partner format or map its data in a manner that is understandable by the CSM 212 for each partner 304. For example, in one embodiment, the service 302 may require that each partner 304 include in its databases at least the most recent 24 months of deposit transactions of all the customers and have, at a minimum, a number of other fields. For example, fields may include information such as unique customer identifier, account identifier, account type, primary account holder, transaction type(s), transaction time(s), transaction amount(s), transaction description(s), whether a transaction has posted, account numbers that the transaction is from and to, whether a transfer is intra-bank or inter-bank, and/or others. Additional fields may be added and fields may be removed for a given implementation or embodiment.

As a result of the service platform described herein, the response provided back to the inquiring entity is anonymous, and no nodes along the way are able to parse the data to identify private information from individual partners. Additionally, since the partner IDs and the service IDs are synchronized, no PII is shared as part of the inquiries. Furthermore, no data is transferred and/or shared between different partners 304 and/or between each partner 304 and the service 302. Finally, the transactions, etc., can be tracked via an immutable private ledger provided in blockchain implementations.

Various challenges are overcome by the described systems and methods. For example, the complexity of the service platform and potential issues in robustness of the service platform are addressed by enabling automatic restarts and introducing heartbeats between various components in the service platform while decoupling components to make them as autonomous as possible. Performance of the service platform may be improved by enabling multiple CN groups and by enabling queuing of inquiries or requests while the computation environment 306 is handling a previous request or inquiry.

FIG. 4 shows an exemplary deployment and security architecture for the service platform of FIG. 1. In addition to leveraging the security features inherent in Secure Multi-Party Computation that guarantee that none of a partner's information is revealed to individual CNs 104 in the computation environment 306, nor to any of the other partners 304, the deployment and security architecture of FIG. 4 is implemented.

FIG. 4 shows how the deployment and security architecture is connected and authenticated according to some embodiments. For deployment of the CSM 212 at a partner 304, the CSM 212 may conform to the partner 304 security standards. As such, each partner 304 may include a key management server (KMS) that stores, in an encrypted format, all secret information, such as private keys and passwords. In order for the CSM 212 to gain access to the secret information in the KMS, the CSM may be authenticated by an authentication service. In some embodiments, the KMS is optional, dependent on the partner 304 security standards.

All components on the service platform 100 side may be deployed in a particular environment (for example, a platform as a service (PAAS) environment) and leverage corresponding services (for example, a built-in secret management service to store and access secure information).

In some embodiments, the requesting entity 202 may use the requesting device 102 to interact with the service platform 100. For example, the requesting device 102 may interact with the service platform 100 via a web-based application, a proxy interface, and so forth. In some embodiments, the interface is hosted by the service platform 100. Communications between the requesting device 102 and the interface are secured using HTTPS or a similar scheme. The requesting entity 202 using the requesting device 102 may be authenticated against a credential service of the service platform 100. In some embodiments, such authentication comprises the requesting entity 202 providing a user name, a password, a client identifier, and a provisioned client secret to obtain an access token (for example, a JSON Web Token (JWT)). The access token is then verified by the interface proxy to grant access to a service interface (for example, an income verification service (IVS) interface). In some embodiments, the access token is set to expire after a threshold period of time. For example, when the requesting entity 202 first logs into the service platform 100, a renewal token with a 24-hour lifetime is issued. The interface can then use the renewal token to renew the access token for up to 24 hours. In some embodiments, the third party interface is similarly authenticated and authorized via the access token mechanism to gain access to the service platform API via the interface proxy.

In some embodiments, communications between the CSM 212 and the orchestrator are secured by a secure protocol, for example a websocket protocol. The CSM 212 authentication may be performed via the authentication services. In some embodiments, for the CSM 212, a credential authentication flow, for which a long-life or even perpetual access token is typically granted, may not be approved. Instead, a Resource Owner Password Credential Grant (authentication server) may be used to authenticate the CSM 212. For example, before the CSM 212 is allowed to connect to the orchestrator, the CSM 212 may be authenticated against the service platform credential service to obtain an access token. The access token may then be sent in an authentication header (for example, an HTTP authentication header) when establishing the secure protocol connection to the orchestrator. The orchestrator may verify the access token before accepting the connection with the CSM 212. Otherwise, the connection between the CSM 212 and the orchestrator may be terminated or torn down. When the access token expires (for example, after the threshold time expires, e.g., 30 minutes), the orchestrator terminates the secure protocol connection to force the CSM 212 to obtain a new access token and re-establish the secure protocol connection.

In some embodiments, communications between the CSM 212 and the computation nodes 104 are secured by HTTPS, or a similar scheme. Before the CSM 212 connects to the computation nodes 104, the CSM 212 may be authenticated against the service platform credential service to obtain an access token that is valid for a threshold period of time. After that, the CSM 212 can use a renewal token to retrieve a new access token for up to 24 hours. Whenever connecting to the computation node 104, the CSM 212 may send an authentication header with the access token. The computation node 104 may validate the access token to establish communications with the CSM 212. Otherwise, the HTTPS (or similar) connection is terminated or torn down. In some embodiments, the threshold period is 30 minutes, such that every 30 minutes, the CSM 212 has to renew its access token using a renewal token and every 24 hours, the CSM 212 has to re-authenticate to restart the renewal cycle. In some embodiments, the CSM 212 credentials are provided by EWACS. In some embodiments, the interface proxy is leveraged to provision the client identifier and client secret, described herein.
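The renewal cycle described above (a 30-minute access token renewed with a renewal token, with full re-authentication every 24 hours) can be sketched as a small decision function in Python. The names `TokenState` and `next_action` and the returned action strings are hypothetical; the sketch models only the timing logic and does not contact any credential service.

```python
from dataclasses import dataclass

ACCESS_TTL_S = 30 * 60         # access token lifetime from the example: 30 minutes
RENEWAL_TTL_S = 24 * 60 * 60   # renewal window from the example: 24 hours


@dataclass
class TokenState:
    authenticated_at: float   # time of the last full authentication (seconds)
    access_issued_at: float   # time the current access token was issued (seconds)


def next_action(state: TokenState, now: float) -> str:
    """Decide what the CSM must do before its next connection attempt."""
    if now - state.authenticated_at >= RENEWAL_TTL_S:
        # The renewal window has elapsed; the renewal cycle must be restarted.
        return "reauthenticate"
    if now - state.access_issued_at >= ACCESS_TTL_S:
        # The access token has expired but the renewal token is still valid.
        return "renew"
    return "use"
```

The re-authentication check is evaluated first so that an expired renewal window always takes precedence over an ordinary access-token renewal.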

In some embodiments, the computation nodes 104 are hosted on the PAAS or similar environment of the service platform 100. As such, the platform may comprise one or more components, in a data center, with all the security components consistent with the service protocol security standards. The computation nodes 104 may be run separately in multiple nodes for security purposes. For example, each of the computation nodes 104 only receives a fragment of any information from the partner 304 so that no node or server at any time can re-assemble/recover the complete information from the fragments it received. The SMPC platform may carry out calculations on these fragments from the partners 304 using the SMPC protocols and operations to achieve high levels of security (or the highest level of security). In addition to the fragmentation concept, communication channels between the computation nodes 104 may be secured via mutual authentication (for example, a TLS protocol) using private keys managed by the PAAS secret manager.

FIG. 5A shows a first exemplary distributed system that implements the service platform of FIG. 1. In one example, the distributed system is built on a synchronous core with asynchronous peripherals. For example, the SMPC engines work synchronously among themselves. The remaining components of the system perform calculations independently in a distributed, heterogeneous system with no or little coordination between partner 304 CSMs 212.

Such a structure may introduce potential issues. For example, communication link or component failures may cause a message to be received by the SMPC engines out of order, which may cause the entire system to fail.

FIG. 5B shows a second exemplary distributed system that implements the service platform of FIG. 1, according to one embodiment. In this example, the distributed system decouples components as much as possible. For example, the CSM 212 at each partner 304 connects as needed to the computation environment 306 and to the internal databases of the partner 304. However, the CSM 212 may maintain communication with the service 302 via the heartbeat. The CNs 104 in the computation environment 306 each maintain “all-or-nothing” states. Additionally, they each have separate timers to ensure individual CNs 104 or groups of CNs 104 recover from bad inquiries or inquiries that are not fully computed (for example, not receiving an input from an expected partner 304 CSM 212 or receiving an input late). The computation environment 306 may report the error and then reset to accept the next inquiry.

In some embodiments, the CSM 212 at each partner 304 may detect agent aliveness (Alive/Dead) via a heartbeat mechanism. The CN 104 availability may be tracked by periodically probing (for example, with a synthetic inquiry) to detect aliveness of CN groups or individual CNs 104, maintaining states (Busy/Free/Dead) of individual CNs 104 or CN groups to distribute inquiry loads, distributing inquiries to multiple CNs 104 or CN groups, and queuing inquiries when none of the CNs 104 or CN groups are free.
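The state tracking and queuing described above can be sketched in Python as a small dispatcher. The class `Dispatcher`, its method names, and the choice to hand a queued inquiry to a group as soon as that group finishes are assumptions for illustration; probing itself is represented only by the `record_probe` callback.

```python
from collections import deque
from enum import Enum
from typing import Deque, Dict, Iterable, Optional


class NodeState(Enum):
    FREE = "Free"
    BUSY = "Busy"
    DEAD = "Dead"


class Dispatcher:
    """Tracks CN group states and queues inquiries when no group is free."""

    def __init__(self, group_ids: Iterable[str]) -> None:
        self.states: Dict[str, NodeState] = {g: NodeState.FREE for g in group_ids}
        self.pending: Deque[str] = deque()

    def record_probe(self, group_id: str, alive: bool) -> None:
        """Update a group's state from a periodic probe result."""
        if not alive:
            self.states[group_id] = NodeState.DEAD
        elif self.states[group_id] is NodeState.DEAD:
            # A previously dead group that answers a probe becomes usable again.
            self.states[group_id] = NodeState.FREE

    def submit(self, inquiry_id: str) -> Optional[str]:
        """Assign the inquiry to a free group, or queue it and return None."""
        for group_id, state in self.states.items():
            if state is NodeState.FREE:
                self.states[group_id] = NodeState.BUSY
                return group_id
        self.pending.append(inquiry_id)
        return None

    def complete(self, group_id: str) -> Optional[str]:
        """Finish the group's current inquiry and return the next queued one, if any."""
        if self.pending:
            # The group stays busy and immediately takes the oldest queued inquiry.
            return self.pending.popleft()
        self.states[group_id] = NodeState.FREE
        return None
```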

Example Credit Scoring Use Case

In credit scoring operations, generating a credit score may utilize a “score” function such as a general score card, a logistic regression, or a machine learning-based model such as gradient boosted decision trees. A credit score for a consumer may be calculated by first mapping each of a number of data elements in each individual trade of the consumer. Then, based on the mapped information, a set of “attributes” is calculated for the consumer (such as attributes related to credit limits, percentage of credit used, revolving account balances, number of delinquent accounts, etc.), as is known in the field of credit scoring. Finally, a credit score is calculated for the consumer using a scoring function and all the corresponding attributes.

However, such processing may be time consuming and difficult to complete given privacy and data security concerns, where multiple entities are unwilling to share their data regarding the consumer with each other or a centralized aggregator. As noted above, decentralized operations using the framework and/or service platform 100 of FIG. 1 may instead be used to implement secure credit score generation based on data from the multiple entities.

For example, following the algorithm described with relation to Equations 1.0 and 2.0 above, the algorithm formulates a centralized version of a credit score operation as in Equation 6.0 below.

$\begin{matrix}{\begin{aligned}\text{Credit score} = \text{Combine}\big(&\text{‘consumer’},\ \text{score}, \\
&\text{Combine}\left(\text{‘consumer’},\ a_{1},\ \text{Map}\left(\text{‘trade’}, l_{a_{1},1}, X\right),\ \text{Map}\left(\text{‘trade’}, l_{a_{1},2}, X\right),\ \ldots,\ \text{Map}\left(\text{‘trade’}, l_{a_{1},j_{a_{1}}}, X\right)\right), \\
&\text{Combine}\left(\text{‘consumer’},\ a_{2},\ \text{Map}\left(\text{‘trade’}, l_{a_{2},1}, X\right),\ \text{Map}\left(\text{‘trade’}, l_{a_{2},2}, X\right),\ \ldots,\ \text{Map}\left(\text{‘trade’}, l_{a_{2},j_{a_{2}}}, X\right)\right), \\
&\ldots, \\
&\text{Combine}\left(\text{‘consumer’},\ a_{k},\ \text{Map}\left(\text{‘trade’}, l_{a_{k},1}, X\right),\ \text{Map}\left(\text{‘trade’}, l_{a_{k},2}, X\right),\ \ldots,\ \text{Map}\left(\text{‘trade’}, l_{a_{k},j_{a_{k}}}, X\right)\right)\big)\end{aligned}} & {\text{Equation 6.0}}\end{matrix}$

As described above, the algorithm decomposes the centralized solution in Equation 6.0 to generate a decentralized solution by allowing the data to come from each of the data providers, as noted in Equations 7.0 and 7.1 below:

$\begin{matrix}{{{Credit}{score}} = {{Combine}\left( {{‘{consumer}’},{score},{{Combine}\left( {{‘{consumer}’},a_{1},{{Map}\left( {{‘{trade}’},l_{a_{1},1},X_{B_{1} - B_{n}}} \right)},{{Map}\left( {{‘{trade}’},l_{a_{1},2},X_{B_{1} - B_{n}}} \right)},\ldots,{{Map}\left( {{‘{trade}’},l_{a_{1},j_{a_{1}}},X_{B_{1} - B_{n}}} \right)}} \right)},{{Combine}\left( {{‘{consumer}’},a_{2},{{Map}\left( {{‘{trade}’},l_{a_{2},1},X_{B_{1} - B_{n}}} \right)},{{Map}\left( {{‘{trade}’},l_{a_{2},2},X_{B_{1} - B_{n}}} \right)},\ldots,{{Map}\left( {{‘{trade}’},l_{a_{2},j_{a_{2}}},X_{B_{1} - B_{n}}} \right)}} \right)},{\ldots\ldots\ldots},{{Combine}\left( {{‘{consumer}’},a_{k},{{Map}\left( {{‘{trade}’},l_{a_{k},1},X_{B_{1} - B_{n}}} \right)},{{Map}\left( {{‘{trade}’},l_{a_{k},2},X_{B_{1} - B_{n}}} \right)},\ldots,{{Map}\left( {{‘{trade}’},l_{a_{k},j_{a_{k}}},X_{B_{1} - B_{n}}} \right)}} \right)}} \right)}} & {{Equation}7.} \\{{{credit}{score}} = {{Combine}\left( {{‘{consumer}’},{score},{{Combine}\begin{pmatrix}{{‘{consumer}’},a_{1},{{Map}\left( {{‘{trade}’},l_{a_{1},1},X_{B_{1}}} \right)},{{Map}\left( {{‘{trade}’},l_{a_{1},2},X_{B_{1}}} \right)},\ldots,{{Map}\left( {{‘{trade}’},l_{a_{1},j_{a_{1}}},X_{B_{1}}} \right)},} \\{{{Map}\left( {{‘{trade}’},l_{a_{2},1},X_{B_{2}}} \right)},{{Map}\left( {{‘{trade}’},l_{a_{2},2},X_{B_{2}}} \right)},\ldots,{{Map}\left( {{‘{trade}’},l_{a_{2},j_{a_{2}}},X_{B_{2}}} \right)},} \\{{\ldots\ldots\ldots},} \\{{{Map}\left( {{‘{trade}’},l_{a_{1},1},X_{B_{n}}} \right)},{{Map}\left( {{‘{trade}’},l_{a_{1},2},X_{B_{n}}} \right)},\ldots,{{Map}\left( {{‘{trade}’},l_{a_{1},j_{a_{1}}},X_{B_{n}}} \right)}}\end{pmatrix}},{{Combine}\begin{pmatrix}{{‘{consumer}’},a_{2},{{Map}\left( {{‘{trade}’},l_{a_{2},1},X_{B_{1}}} \right)},{{Map}\left( {{‘{trade}’},l_{a_{2},2},X_{B_{1}}} \right)},\ldots,{{Map}\left( {{‘{trade}’},l_{a_{2},j_{a_{2}}},X_{B_{1}}} \right)},} \\{{{Map}\left( {{‘{trade}’},l_{a_{2},1},X_{B_{2}}} \right)},{{Map}\left( {{‘{trade}’},l_{a_{2},2},X_{B_{2}}} 
\right)},\ldots,{{Map}\left( {{‘{trade}’},l_{a_{2},j_{a_{2}}},X_{B_{2}}} \right)},} \\{{\ldots\ldots\ldots},} \\{{{Map}\left( {{‘{trade}’},l_{a_{2},1},X_{B_{n}}} \right)},{{Map}\left( {{‘{trade}’},l_{a_{2},2},X_{B_{n}}} \right)},\ldots,{{Map}\left( {{‘{trade}’},l_{a_{2},j_{a_{2}}},X_{B_{n}}} \right)}}\end{pmatrix}},{\ldots\ldots\ldots},{{Combine}\begin{pmatrix}{{‘{consumer}’},a_{k},{{Map}\left( {{‘{trade}’},l_{a_{k},1},X_{B_{1}}} \right)},{{Map}\left( {{‘{trade}’},l_{a_{k},2},X_{B_{1}}} \right)},\ldots,{{Map}\left( {{‘{trade}’},l_{a_{k},j_{a_{1}}},X_{B_{1}}} \right)},} \\{{{Map}\left( {{‘{trade}’},l_{a_{k},1},X_{B_{2}}} \right)},{{Map}\left( {{‘{trade}’},l_{a_{k},2},X_{B_{2}}} \right)},\ldots,{{Map}\left( {{‘{trade}’},l_{a_{k},j_{a_{1}}},X_{B_{2}}} \right)},} \\{{\ldots\ldots\ldots},} \\{{{Map}\left( {{‘{trade}’},l_{a_{k},1},X_{B_{n}}} \right)},{{Map}\left( {{‘{trade}’},l_{a_{k},2},X_{B_{n}}} \right)},\ldots,{{Map}\left( {{‘{trade}’},l_{a_{k},j_{a_{1}}},X_{B_{n}}} \right)}}\end{pmatrix}}} \right.}} & {{Equation}7.1}\end{matrix}$

The algorithm then identifies the C_or_M_(local) and C_or_M_(secure) operations for Equation 7.1. For example, the algorithm identifies the Map functions in Equation 7.1 to be executable locally in the data providers and the Combine functions to be executed in the SMPC platform when they are not decomposable. This example is described in more detail below.
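The split between local Map operations and secure Combine operations can be sketched in Python for a single, simple attribute. The sketch assumes additive secret sharing and uses a sum of revolving balances purely for illustration; the function names `map_local`, `share`, `combine_secure`, and `revolving_balance`, and the trade field names, are hypothetical and do not come from the description.

```python
import secrets
from typing import Callable, Dict, List

MODULUS = 2**61 - 1  # assumed modulus; values must lie in [0, MODULUS)

Trade = Dict[str, object]


def map_local(trades: List[Trade], keep: Callable[[Trade], bool], field: str) -> int:
    """Map step, run inside one data provider: summarize its own trades."""
    return sum(int(t[field]) for t in trades if keep(t))


def share(value: int, n_nodes: int) -> List[int]:
    """Split a local summary into one additive share per computation node."""
    parts = [secrets.randbelow(MODULUS) for _ in range(n_nodes - 1)]
    return parts + [(value - sum(parts)) % MODULUS]


def combine_secure(shares_by_provider: List[List[int]]) -> int:
    """Combine step: each node adds only the shares it holds, then totals are revealed."""
    # zip(*...) groups the shares by node, so no node sees a provider's full value.
    node_totals = [sum(column) % MODULUS for column in zip(*shares_by_provider)]
    return sum(node_totals) % MODULUS


def revolving_balance(providers: List[List[Trade]], n_nodes: int) -> int:
    """Compute a consumer-level attribute across providers without pooling raw trades."""
    is_revolving = lambda t: t["type"] == "revolving"
    local_values = [map_local(trades, is_revolving, "balance") for trades in providers]
    return combine_secure([share(v, n_nodes) for v in local_values])
```

In this sketch each provider's trades never leave `map_local`, and only shares of the per-provider summaries reach the combining step.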

The service platform 100 of FIG. 1 may compute a consumer's credit score. Specifically, the computation nodes 104a-104d of the SMPC platform may comprise one or more components used to compute the consumer's credit score (for example, Bob Dixon's credit score). For example, the SMPC platform may have access to one or more secure credit score models and the secure attributes as provided by the data providers. The credit score models may enable the computation nodes 104a-104d of the SMPC platform to calculate a consumer's (for example, Bob Dixon's) credit score based on the data fragments provided by the data providers. In some embodiments, the SMPC platform may comprise one or more layers of functions that ensure security of calculations of the credit attributes and scores.

For example, a filter layer of the SMPC platform may comprise functions that calculate summary statistics on consumer trades for credit attribute calculation. However, such calculations may be based primarily on information from a single data provider; as such, that data provider having the information may perform the calculations locally. The local processing may improve times involved to perform calculations by reducing times for communicating the information on which the calculations are based. The filter layer may provide such filters for the local agents or processors 212 of the data providers that are commonly used in attribute calculation. In some embodiments, the filter layer works alongside a virtual machine layer or similar tool that implements secure computations by the computation nodes 104a-104d.

The SMPC platform may comprise a secure data adapter layer. The secure data adapter layer may provide an adapter that connects databases of different data providers to the computation nodes of the SMPC platform (for example, via the secure connection described above with reference to FIG. 1). In some embodiments, the adapter may create a secure connection between the computation nodes 104 a-104 d and the institutions 106 a-106 g to maintain data security and privacy. In some embodiments, the secure data adapter layer sits on top of (for example, connects to) the virtual machine and/or the filter layer.

In some embodiments, the SMPC platform comprises a secure common functions layer. The secure common functions layer comprises one or more calculations that may be used to calculate the credit attributes. The secure common functions layer may provide a variety of functions to calculate statistics, for example, average, minimum, maximum, sum, count, and so forth across a given set of trades for a consumer. Each function allows for initialization and capping under customized conditions. In some embodiments, all the calculations are carried out securely under the SMPC protocol without revealing any individual data provider's private data. In some embodiments, the secure common functions layer sits on top of (for example, connects to) a virtual machine and/or a filter layer alongside the secure data adapter layer.
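One way to picture a secure sum or average is additive secret sharing, a standard SMPC building block. The disclosure does not state which protocol the platform uses, so the Python sketch below is an assumption throughout: the modulus, the four-node setup, and the function names are illustrative only.

```python
import secrets
from typing import List

MODULUS = 2**61 - 1   # assumed share modulus; inputs must be smaller than this
N_NODES = 4           # assumed number of computation nodes


def make_shares(value: int, n_nodes: int = N_NODES) -> List[int]:
    """Split a provider's private value into random-looking additive shares."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_nodes - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(shared_inputs: List[List[int]]) -> int:
    """Each node adds the shares it received; only the node totals are combined."""
    node_totals = [sum(column) % MODULUS for column in zip(*shared_inputs)]
    return sum(node_totals) % MODULUS


def secure_average(local_sums: List[int], local_counts: List[int]) -> float:
    """Average across providers from shared sums and shared counts."""
    total = secure_sum([make_shares(s) for s in local_sums])
    count = secure_sum([make_shares(c) for c in local_counts])
    return total / count if count else 0.0
```

In this sketch each node sees only uniformly random shares, so no single node can recover a provider's value, while the combined node totals still reconstruct the aggregate. Minimum, maximum, and capping would need comparison protocols that are not shown here.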

The SMPC platform may include a secure attributes layer. The secure attributes layer may implement a number of secure attributes (for example, over 100 secure attributes). The SMPC platform may further comprise a secure credit score model layer that provides capabilities for calculating a credit score for the consumer securely based on secure attributes. The secure credit score model layer sits on top of (for example, connects to) the secure attributes layer.

Such an implementation for credit score determinations for a consumer may improve and/or overcome various issues of data aggregators, for example by reducing calculation times and enabling real-time calculations. Different numbers of computation nodes in the SMPC platform may impact calculation times. For example, as the number of computation nodes 104 increases, the communication costs may increase. Thus, as more computation nodes 104 are included in the SMPC platform, the number of computations that can be handled in a given timeframe may decrease when computing in real time. However, increasing the number of computation nodes 104 may increase the number of consumers for which the SMPC platform can process credit scores in a given amount of time when computing in batch.

These results demonstrate the feasibility of the SMPC platform and corresponding service platform 100 for calculating the credit attributes and scores in real time and/or in batch. In some embodiments, having the computation nodes disposed in different networks (for example, different networks 108) may change how quickly the above computations can be performed. Additionally, increasing the number of computation nodes while reducing communication costs may reduce computation times.

3. Example System Implementation and Architecture

FIG. 6 is a block diagram showing example components of a transaction data processing system 1000. The system 1000 or variations thereof may be used, in some embodiments, as part of the service 302, to implement computation nodes in an SMPC arrangement and/or by a partner to implement CSM 212 functionality. The processing system 1000 includes, for example, a personal computer that is IBM, Macintosh, or Linux/Unix compatible or a server or workstation. In one embodiment, the processing system 1000 includes a server, a laptop computer, a smart phone, a personal digital assistant, a kiosk, or a media player, for example. In one embodiment, the processing system 1000 includes one or more central processing units (“CPU”) 1005, which may each include a conventional or proprietary microprocessor specially configured to perform, in whole or in part, one or more of the machine learning recommendation/result model features described above. The processing system 1000 further includes one or more memories 1032, such as random access memory (“RAM”) for temporary storage of information, one or more read only memories (“ROM”) for permanent storage of information, and one or more mass storage devices 1022, such as a hard drive, diskette, solid state drive, or optical media storage device. A specially architected transaction data store 1008 may be provided. The transaction data store 1008 may be optimized for storing raw and/or compressed transaction data as well as recommendation/result modeling data as described above. In some implementations, the transaction data store 1008 may be designed to handle large quantities of data and provide fast retrieval of the records. To facilitate efficient storage and retrieval, the transaction data store 1008 may be indexed using one or more of compressed transaction data, user identifiers, transaction category, merchant identifiers, or other data such as described above.

Typically, the components of the processing system 1000 are connected using a standards-based bus system 1090. In different embodiments, the standards-based bus system 1090 could be implemented in Peripheral Component Interconnect (“PCI”), Microchannel, Small Computer System Interface (“SCSI”), Industry Standard Architecture (“ISA”) and Extended ISA (“EISA”) architectures, for example. In addition, the functionality provided for in the components and modules of processing system 1000 may be combined into fewer components and modules or further separated into additional components and modules.

The processing system 1000 is generally controlled and coordinated by operating system software, such as Windows XP, Windows Vista, Windows 7, Windows 8, Windows Server, Unix, Linux, SunOS, Solaris, iOS, Blackberry OS, Android, or other compatible operating systems. In Macintosh systems, the operating system may be any available operating system, such as Mac OS X. In other embodiments, the processing system 1000 may be controlled by a proprietary operating system. The operating system is configured to control and schedule computer processes for execution, perform memory management, provide file system, networking, I/O services, and provide a user interface, such as a graphical user interface (“GUI”), among other things.

The processing system 1000 may include one or more commonly available input/output (I/O) devices and interfaces 1012, such as a keyboard, mouse, touchpad, and printer. In one embodiment, the I/O devices and interfaces 1012 include one or more display devices, such as a monitor, that allow the visual presentation of data to a user. More particularly, a display device provides for the presentation of GUIs, application software data, and multimedia presentations, for example. The processing system 1000 may also include one or more multimedia devices 1042, such as speakers, video cards, graphics accelerators, and microphones, for example.

In the embodiment of FIG. 6, the I/O devices and interfaces 1012 provide a communication interface to various external devices. The processing system 1000 may be electronically coupled to one or more networks, which comprise one or more of a LAN, WAN, cellular network, satellite network, and/or the Internet, for example, via a wired, wireless, or combination of wired and wireless, communication link. The networks communicate with various computing devices and/or other electronic devices, such as the credit bureau data source and financial information data sources, via wired or wireless communication links.

In some embodiments, information may be provided to the processing system 1000 over a network from one or more data sources. The data sources may include one or more internal and/or external data sources that provide transaction data, such as credit issuers (e.g., financial institutions that issue credit cards), transaction processors (e.g., entities that process credit card swipes at points of sale), and/or transaction aggregators. The data sources may include internal and external data sources which store, for example, credit bureau data and/or other user data. In some embodiments, one or more of the databases or data sources may be implemented using a relational database, such as Sybase, Oracle, CodeBase and Microsoft® SQL Server as well as other types of databases such as, for example, a flat file database, an entity-relationship database, an object-oriented database, and/or a record-based database.

In general, the word “module,” as used herein, refers to logic embodied in hardware or firmware, or to a collection of software instructions, possibly having entry and exit points, written in a programming language, such as, for example, Java, Lua, C or C++. A software module may be compiled and linked into an executable program, installed in a dynamic link library, or may be written in an interpreted programming language such as, for example, BASIC, Perl, or Python. It will be appreciated that software modules may be callable from other modules or from themselves, and/or may be invoked in response to detected events or interrupts. Software modules configured for execution on computing devices may be provided on a computer readable medium, such as a compact disc, digital video disc, flash drive, or any other tangible medium. Such software code may be stored, partially or fully, on a memory device of the executing computing device, such as the processing system 1000, for execution by the computing device. Software instructions may be embedded in firmware, such as an EPROM. It will be further appreciated that hardware modules may be comprised of connected logic units, such as gates and flip-flops, and/or may be comprised of programmable units, such as programmable gate arrays or processors. The modules described herein are preferably implemented as software modules. They may be represented in hardware or firmware. Generally, the modules described herein refer to logical modules that may be combined with other modules or divided into sub-modules despite their physical organization or storage.

In the example of FIG. 6, the modules 1010 may be configured for execution by the CPU 1005 to perform, in whole or in part, any or all of the processes discussed above, such as those shown in FIGS. 1, 2, 3, 4, 5A, and 5B.

4. Additional Embodiments

Each of the processes, methods, and algorithms described in the preceding sections may be embodied in, and fully or partially automated by, code modules executed by one or more computer systems or computer processors comprising computer hardware. The code modules may be stored on any type of non-transitory computer-readable medium or computer storage device, such as hard drives, solid state memory, optical disc, and/or the like. The systems and modules may also be transmitted as generated data signals (for example, as part of a carrier wave or other analog or digital propagated signal) on a variety of computer-readable transmission mediums, including wireless-based and wired/cable-based mediums, and may take a variety of forms (for example, as part of a single or multiplexed analog signal, or as multiple discrete digital packets or frames). The processes and algorithms may be implemented partially or wholly in application-specific circuitry. The results of the disclosed processes and process steps may be stored, persistently or otherwise, in any type of non-transitory computer storage such as, for example, volatile or non-volatile storage.

The various features and processes described above may be used independently of one another, or may be combined in various ways. All possible combinations and sub-combinations are intended to fall within the scope of this disclosure. In addition, certain method or process blocks may be omitted in some implementations. The methods and processes described herein are also not limited to any particular sequence, and the blocks or states relating thereto can be performed in other sequences that are appropriate. For example, described blocks or states may be performed in an order other than that specifically disclosed, or multiple blocks or states may be combined in a single block or state. The example blocks or states may be performed in serial, in parallel, or in some other manner. Blocks or states may be added to or removed from the disclosed example embodiments. The example systems and components described herein may be configured differently than described. For example, elements may be added to, removed from, or rearranged compared to the disclosed example embodiments.

Conditional language, such as, among others, “can,” “could,” “might,” or “may,” unless specifically stated otherwise, or otherwise understood within the context as used, is generally intended to convey that certain embodiments include, while other embodiments do not include, certain features, elements and/or steps. Thus, such conditional language is not generally intended to imply that features, elements and/or steps are in any way required for one or more embodiments or that one or more embodiments necessarily include logic for deciding, with or without user input or prompting, whether these features, elements and/or steps are included or are to be performed in any particular embodiment.

Any process descriptions, elements, or blocks in the flow diagrams described herein and/or depicted in the attached figures should be understood as potentially representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps in the process. Alternate implementations are included within the scope of the embodiments described herein in which elements or functions may be deleted, executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those skilled in the art.

All of the methods and processes described above may be embodied in, and partially or fully automated via, software code modules executed by one or more specially configured general purpose computers. For example, the methods described herein may be performed by a processing system, card reader, point of sale device, acquisition server, card issuer server, and/or any other suitable computing device. The methods may be executed on the computing devices in response to execution of software instructions or other executable code read from a tangible computer readable medium. A tangible computer readable medium is a data storage device that can store data that is readable by a computer system. Examples of computer readable mediums include read-only memory, random-access memory, other volatile or non-volatile memory devices, compact disk read-only memories (CD-ROMs), magnetic tape, flash drives, and optical data storage devices.

It should be emphasized that many variations and modifications may be made to the above-described embodiments, the elements of which are to be understood as being among other acceptable examples. All such modifications and variations are intended to be included herein within the scope of this disclosure. The foregoing description details certain embodiments. It will be appreciated that no matter how detailed the foregoing appears in text, the systems and methods can be practiced in many ways. As is also stated above, it should be noted that the use of particular terminology when describing certain features or aspects of the systems and methods should not be taken to imply that the terminology is being re-defined herein to be restricted to including any specific characteristics of the features or aspects of the systems and methods with which that terminology is associated.

Further detail regarding embodiments relating to the systems and methods disclosed herein, as well as other embodiments, is provided in the Appendix of the present application, the entirety of which is bodily incorporated herein and the entirety of which is also incorporated by reference herein and made a part of this specification. The Appendix provides examples of features that may be provided by a system that implements at least some of the functionality described herein, according to some embodiments, as well as specific system configuration and implementation details according to certain embodiments of the present disclosure.

1-20. (canceled)
21. A system comprising: memory; and at least one computing device configured with computer-executable instructions that, when executed, cause the at least one computing device to perform operations comprising at least: receiving an inquiry from a requesting entity system requesting generation or calculation of information regarding an individual based on pieces of protected information held by a plurality of data provider systems; based on the inquiry, communicating requests to each of a plurality of computation nodes to generate encrypted data fragments associated with the inquiry, wherein the requests each include an identifier corresponding to the individual, wherein each of the plurality of computation nodes is associated with a respective one of the data provider systems and is designated to process one or more of the pieces of protected information held by the respective data provider; receiving, in response to the requests, the encrypted data fragments from the plurality of computation nodes without receiving any of the protected information, wherein a first subset of the encrypted data fragments received from a first computation node are different than a second subset of the encrypted data fragments received from a second computation node, wherein the first subset of the encrypted data fragments have been generated based on one or more different pieces of protected information than the second subset of the encrypted data fragments have been generated based on, wherein encrypted data fragments generated by each computation node cannot be re-assembled by any other computation node to recover corresponding protected information; performing secure multi-party computations based on the encrypted data fragments received from each of the plurality of computation nodes; generating a result related to the individual based on the secure multi-party computations; and generating and transmitting, to the requesting entity system, a response to the inquiry based on the result related to the individual.
22. The system of claim 21, wherein the plurality of data provider systems comprises one or more of: a financial institution, a healthcare institution, or a consumer data institution.
23. The system of claim 21, further comprising an identifier database, wherein the at least one computing device is further configured to: identify respective identifiers of the individual for each of the plurality of data provider systems based on the inquiry, and communicate the respective identifiers to the respective computation node for each data provider system.
24. The system of claim 21, wherein each computation node is configured to perform initial computations with respect to one or more of the pieces of protected information before generating corresponding encrypted data fragments.
25. The system of claim 21, wherein the plurality of computation nodes are clustered or grouped into two or more clusters or computation groups that are each assigned to a different one of the plurality of data provider systems.
26. The system of claim 25, wherein a quantity of computation nodes within an individual cluster or computation group is based on a desired security level.
27. The system of claim 21, wherein the inquiry comprises an information verification request comprising verification information to be verified, and wherein the response to the inquiry is an affirmative or negative response.
28. The system of claim 27, wherein the at least one computing device is further configured to provide the affirmative response to the requesting entity system in response to the inquiry when the result verifies the verification information.
29. The system of claim 21, wherein the inquiry comprises a request to compute a credit score for the individual, and wherein the response comprises the credit score for the individual.
30. A computer-implemented method comprising: receiving an inquiry from a requesting entity system requesting generation or calculation of information regarding an individual based on pieces of protected information held by a plurality of data provider systems; based on the inquiry, communicating requests to each of a plurality of computation nodes to generate encrypted data fragments associated with the inquiry, wherein the requests each include an identifier corresponding to the individual, wherein each of the plurality of computation nodes is associated with a respective one of the data provider systems and is designated to process one or more of the pieces of protected information held by the respective data provider; receiving, in response to the requests, the encrypted data fragments from the plurality of computation nodes without receiving any of the protected information, wherein a first subset of the encrypted data fragments received from a first computation node are different than a second subset of the encrypted data fragments received from a second computation node, wherein the first subset of the encrypted data fragments have been generated based on one or more different pieces of protected information than the second subset of the encrypted data fragments have been generated based on, wherein encrypted data fragments generated by each computation node cannot be re-assembled by any other computation node to recover corresponding protected information; performing secure multi-party computations based on the encrypted data fragments received from each of the plurality of computation nodes; generating a result related to the individual based on the secure multi-party computations; and generating and transmitting, to the requesting entity system, a response to the inquiry based on the result related to the individual.
31. The computer-implemented method of claim 30, wherein the plurality of data provider systems comprises one or more of: a financial institution, a healthcare institution, or a consumer data institution.
32. The computer-implemented method of claim 30, wherein each computation node is configured to perform initial computations with respect to one or more of the pieces of protected information before generating corresponding encrypted data fragments.
33. The computer-implemented method of claim 30, wherein the plurality of computation nodes are clustered or grouped into two or more clusters or computation groups that are each assigned to a different one of the plurality of data provider systems.
34. The computer-implemented method of claim 33, wherein a quantity of computation nodes within an individual cluster or computation group is based on a desired security level.
35. The computer-implemented method of claim 30, wherein the inquiry comprises an information verification request comprising verification information to be verified, and wherein the response to the inquiry is an affirmative or negative response.
36. The computer-implemented method of claim 35, wherein the at least one computing device is further configured to provide the affirmative response to the requesting entity system in response to the inquiry when the result verifies the verification information.
37. The computer-implemented method of claim 30, wherein the inquiry comprises a request to compute a credit score for the individual, and wherein the response comprises the credit score for the individual.
38. A non-transitory computer storage medium storing computer-executable instructions that, when executed by one or more processors, cause the one or more processors to perform operations comprising at least: receiving an inquiry from a requesting entity system requesting generation or calculation of information regarding an individual based on pieces of protected information held by a plurality of data provider systems; based on the inquiry, communicating requests to each of a plurality of computation nodes to generate encrypted data fragments associated with the inquiry, wherein the requests each include an identifier corresponding to the individual, wherein each of the plurality of computation nodes is associated with a respective one of the data provider systems and is designated to process one or more of the pieces of protected information held by the respective data provider; receiving, in response to the requests, the encrypted data fragments from the plurality of computation nodes without receiving any of the protected information, wherein a first subset of the encrypted data fragments received from a first computation node are different than a second subset of the encrypted data fragments received from a second computation node, wherein the first subset of the encrypted data fragments have been generated based on one or more different pieces of protected information than the second subset of the encrypted data fragments have been generated based on, wherein encrypted data fragments generated by each computation node cannot be re-assembled by any other computation node to recover corresponding protected information; performing secure multi-party computations based on the encrypted data fragments received from each of the plurality of computation nodes; generating a result related to the individual based on the secure multi-party computations; and generating and transmitting, to the requesting entity system, a response to the inquiry based on the result related to the individual.